DUDLEY KNOX LIBRARY
NAVAL POSTGRADUATE SCHOOL
MONTEREY, CALIFORNIA 93943-5002







NAVAL POSTGRADUATE SCHOOL 



DESIGN AND IMPLEMENTATION OF A 
"C" COMPILER FOR AN ABSTRACT MACHINE 

by 

Metin Gursel OZISIK 
June 1986 

Thesis Advisor: Daniel L. Davis 

Approved for public release; distribution is unlimited. 



Monterey, California




THESIS 




SECURITY CLASSIFICATION OF THIS PAGE



REPORT DOCUMENTATION PAGE

1a.  REPORT SECURITY CLASSIFICATION: UNCLASSIFIED
1b.  RESTRICTIVE MARKINGS:
2a.  SECURITY CLASSIFICATION AUTHORITY:
2b.  DECLASSIFICATION/DOWNGRADING SCHEDULE:
3.   DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public
     release; distribution is unlimited
4.   PERFORMING ORGANIZATION REPORT NUMBER(S):
5.   MONITORING ORGANIZATION REPORT NUMBER(S):
6a.  NAME OF PERFORMING ORGANIZATION: Naval Postgraduate School
6b.  OFFICE SYMBOL (If applicable):
6c.  ADDRESS (City, State, and ZIP Code): Monterey, CA 93943-5000
7a.  NAME OF MONITORING ORGANIZATION: Naval Postgraduate School
7b.  ADDRESS (City, State, and ZIP Code): Monterey, CA 93943-5000
8a.  NAME OF FUNDING/SPONSORING ORGANIZATION:
8b.  OFFICE SYMBOL (If applicable):
8c.  ADDRESS (City, State, and ZIP Code):
9.   PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER:
10.  SOURCE OF FUNDING NUMBERS (PROGRAM ELEMENT NO., PROJECT NO.,
     TASK NO., WORK UNIT ACCESSION NO.):
11.  TITLE (Include Security Classification): Design and
     Implementation of a C Compiler for an Abstract Machine
     (UNCLASSIFIED)
12.  PERSONAL AUTHOR(S):
13a. TYPE OF REPORT: Master's Thesis
13b. TIME COVERED: FROM    TO
14.  DATE OF REPORT (Year, Month, Day): 1986 June 20
15.  PAGE COUNT: 109
16.  SUPPLEMENTARY NOTATION:
17.  COSATI CODES (FIELD, GROUP, SUB-GROUP):
18.  SUBJECT TERMS (Continue on reverse if necessary and identify
     by block number): "C" Compiler
19.  ABSTRACT (Continue on reverse if necessary and identify by
     block number):

     The technique of formal abstraction provides an appropriate
     tool for specifying an interface between layers of computer
     hardware and software. An abstract machine called AM has
     been built to address the problem of portability and
     reusability of software. This thesis is the design and
     implementation of a "C" Compiler for this abstract machine.

20.  DISTRIBUTION/AVAILABILITY OF ABSTRACT:
     [x] UNCLASSIFIED/UNLIMITED  [ ] SAME AS RPT  [ ] DTIC USERS
21.  ABSTRACT SECURITY CLASSIFICATION: UNCLASSIFIED
22a. NAME OF RESPONSIBLE INDIVIDUAL: Prof. Daniel L. Davis
22b. TELEPHONE (Include Area Code): 408 646-3091
22c. OFFICE SYMBOL: 52Dv

DD FORM 1473, 84 MAR. 83 APR edition may be used until exhausted.
All other editions are obsolete.



Approved for public release; distribution unlimited



Design and Implementation of a C Compiler 
for an Abstract Machine 



by 



Metin Gursel Ozisik
Ustegmen, Turkish Navy
B.S., Turkish Naval Academy, 1980

Submitted in partial fulfillment of the 
requirement for the degree of 

MASTER OF SCIENCE IN COMPUTER SCIENCE 

from the 

NAVAL POSTGRADUATE SCHOOL
June 1986 






ABSTRACT

The technique of formal abstraction provides an
appropriate tool for specifying an interface between layers
of computer hardware and software. An abstract machine called
AM has been built to address the problem of portability and
reusability of software. This thesis is the design and
implementation of a "C" compiler for this abstract machine.

I. INTRODUCTION



In today's computer world, portability is a well-known
problem which arises in a variety of situations. Since
computer software evolves in connection with a particular
hardware environment, and often assumes features closely
related to the characteristics of its own hardware, this
problem has been unavoidable.

Formalizing the relationship between hardware and
software resources was treated in a previous NPS thesis by
Yurchak [Ref. 1], whose efforts resulted in the specification
and implementation of an abstract machine called AM.

The abstraction of a bit mapped display resource was
added to AM in another NPS thesis by Hunter [Ref. 2].

Finally, an abstraction of a formally specified reusable
database was added to the same machine by Zang [Ref. 3].

This presentation is a further extension of the work
started by Yurchak and Hunter: an abstract computer and its
programming environment. Its major objective is a compiler
for a subset of the C language for AM.

A. THE PORTABILITY PROBLEM 

It is well known that moving large programs from one
machine to another is frustrating work. It is also known
that once the software has been moved to the new machine, it
is not predictable whether or not it will work as before.
Even if it seems to work, it may consume more resources than
expected.

For a couple of reasons, the portability problem is
getting worse, not better:

- Computer architectures have been changed to make them
  look like what the programmer wants

- The number of devices included in modern architectures
  has been maximized

- Both languages and machines are related to the data they
  manipulate in an implementation dependent way

These and other factors make portability difficult to
achieve, and in addition they affect other difficult issues
such as language design and software engineering.

B. CURRENT IMPLEMENTATIONS TO SOLVE PORTABILITY PROBLEM

The use of high level languages provides a degree of
abstraction, and some measure of software standardization
and portability. But the portability of high level languages
is limited, since all the layers of software below this high
level have to be moved in order to port such a system.

There are other abstraction levels between the computer
hardware and the application environments. Operating systems
in particular represent a software abstraction of physical
resources, and support the layers of software built over
this level. Starting with CP/M and UNIX, we have seen some
good implementations that provide such an abstract level to
some degree. The main idea of the AM machine is to abstract
and formally define other physical resources found in
typical computing systems.

II. ABSTRACT MACHINE, AM

The Abstract Machine (AM) is a result of Yurchak's [Ref. 1]
and Hunter's [Ref. 2] efforts to solve the problem of
formalizing the relationship between hardware and software
resources. It is implemented as a finite state machine
interpreter, with an assembler. Details of the newest version
of the AM assembler can be found in Zang's [Ref. 3] thesis.

"Abstraction" describes the separation of the defining
properties of an object from other, unnecessary details about
it. A programmer is primarily concerned with solving a
problem. Appropriately, the tools at his disposal, such as
programming languages, development aids, and the programming
environment, form a problem solving abstraction. The hardware
(and some of the software) on which this problem solving
abstraction is implemented, however, is an abstraction of a
different sort.

The fuzzy area between software and physical resource
abstractions, sometimes simplistically perceived as the
boundary between hardware and software, exposes a number of
shortcomings in language design and computer architecture
collectively termed the "semantic gap".

Narrowing the semantic gap requires significant changes
in the fundamentals of computer architecture and language
design. Three major factors which significantly contribute to
this problem are:

- Informally described semantics;

- Representation dependent data types;

- Arbitrarily designed instruction set architectures.

The AM was designed to fill this semantic gap by
addressing the above problems [Ref. 1].

In the AM implementation, a text file representing an
assembly language program is translated by the assembler into
a relocatable object module. A loader, part of the AM
interpreter, loads this object module into the appropriate
cells, and AM executes it.

The following presentation is an implementation of a
subset of the high level language "C" for that abstract
machine. It is a compiler which compiles C source code and
generates assembler source code for the AM.

III. DISCUSSION OF SUBSET

Since commercially good compilers are very large programs,
and it takes on average six man-years to write one of them,
this research work had to be limited to a small subset of the
C language.

The goal was to write a compiler for a small portion of C
in the C language itself, and then, by feeding the compiler
its own source code, to create a native code C compiler.

Since this work was going to be a race against time, the
subset had to be as small as possible, but on the other hand
had to be large enough to compile its own source code.

The sub-goal was to use a strictly limited number of
features to write the compiler, because any new feature used
in the code would require implementation of the same feature
in the compiler.

The outcome of this work was not sophisticated enough to
compile itself. It evolved as a small subset of the C
programming language, called "Tiny-C". Since it is not
sufficient to compile its own code, it is used as a
cross-compiler from host MS-DOS computers to the target
machine AM.

A. TINY-C SUBSET 

Tiny-C is a small subset of C, and is more a thesis project
than a language. There are many features which a real
programming language has to have, but Tiny-C does not.

The Tiny-C compiler was written in five months and is
considered to have the fundamental structure of a real
compiler. Hopefully it will be modified and improved in the
future, and may become usable for real applications.

Appendix A is a listing of the Tiny-C language grammar.
This grammar is obviously not the complete "C" language.
At least:

- Structure and union specifiers are not included.

- Functions are not allowed to return addresses.

- Assignments inside expressions are not allowed,
  because they were considered as making programs
  "unreadable". For instance:
  "if ((joe = jimmy + 5) > 5)" is not allowed in Tiny-C.

- Multiple assignments are not implemented. For instance:
  "jimmy = joe = mary;" is an invalid statement in Tiny-C.
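
The two restrictions above can be worked around by giving every assignment its own statement. The sketch below is only an illustration in ordinary C: the function names and the variables joe, jimmy and mary are hypothetical, and the commented-out lines show the forms that Tiny-C would reject.

```c
/* Hypothetical illustration: constructs Tiny-C rejects, rewritten
   so that each assignment stands alone as a statement. */

int exceeds_five(int jimmy)
{
    int joe;

    /* Rejected by Tiny-C:  if ((joe = jimmy + 5) > 5) ...  */
    joe = jimmy + 5;            /* the assignment becomes its own statement */
    if (joe > 5)
        return 1;
    return 0;
}

int copy_twice(int mary)
{
    int jimmy;
    int joe;

    /* Rejected by Tiny-C:  jimmy = joe = mary;  */
    joe = mary;                 /* one assignment per statement */
    jimmy = joe;
    return jimmy + joe;
}
```

The rewritten forms compute the same values; only the nesting of assignments is removed.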



B. THE TINY-C COMPILER 

Even though the Tiny-C language subset was planned within 
the limits in this thesis, the Tiny-C compiler can only 
compile and generate code for an even smaller subset of the 
above grammar. 

The Tiny-C compiler implementation can parse the whole
Tiny-C subset and give proper error messages if necessary.
But,

- due to the time constraints, and

- due to the restricted capabilities of the target AM
  machine

the Tiny-C compiler cannot generate code for the whole Tiny-C
language.

In the Tiny-C compiler:

- Floating point arithmetic is not implemented, because it
  is not supported by AM.

- Bitwise and shift expressions are not implemented, since
  they are not supported by AM.

- Since AM has strictly defined data types and does not
  allow type conversions, address, pointer and array types
  are not implemented.

- Since AM is designed as an operating system independent
  software machine, the "#include" preprocessor command is
  not implemented.



- Since AM does not have a linker yet, external
  declarations are not implemented.

- The auto, static, register and boolean types are not
  implemented.

IV. THE DESIGN

This chapter describes the Tiny-C compiler step by step.
But obviously the purpose of this presentation is not to
teach the "compiler writing art", or to explain the target
Abstract Machine's assembler. Complete documentation for the
AM Assembler can be found in Yurchak's thesis [Ref. 1]. For a
better understanding of the following structures, Ullman's
"Compilers, Techniques and Tools" [Ref. 4] is recommended as
a background reference for compiler writing.

The Tiny-C Compiler is written in nine steps. These are: 

- Scanner or Lexical Analyzer 

- Grammar 

- Recursive Descent Parser with Backtracking 

- Data Structures for the Parser 

- Error Checking and Error Messages 

- Emission of Intermediate Code 

- Intermediate Code Optimization

- Data Structures for the Code Generator 

- Target Code Generation 

We will first go through these steps briefly in order to
get acquainted with the architecture of the Tiny-C compiler.

A. SCANNER AND LEXICAL ANALYZER 

In general, scanners and lexical analyzers are language
independent structures. The same scanner may be used for
several different compilers. For this reason we will
introduce this structure even before discussing the Tiny-C
grammar.

Contrary to the heading of this section, Tiny-C does not
have a scanner or lexical analyzer in the classical sense.

The most common way of writing compilers is to analyze
the input data stream lexically and, after tokenizing, to
pass tokens to the parser as they are needed. This was not
the way scanning was implemented in this compiler. The Tiny-C
scanner is made up of a couple of routines used by a
recursive descent parser with a backtracking tool. There is
no tokenized data stream.

The idea is to read the input stream into a scanner
buffer (which is implemented as a ring buffer) and parse
it there. This technique gives an ability to backtrack and
makes it possible to write a very simple recursive descent
top-down parser. With such a backtracking tool, the grammar
does not need to be massaged into a fully LL(1) grammar; it
can be used even if it is ambiguous in the LL(1) sense. In
any ambiguous case, the parser can try all possible options
by backtracking.

Let's start by introducing our scanner buffer and its
initialization.

init_buf()              /* initialize scanner buffer */
{
    Reads the input source file into the scanner buffer. Sets
    the pointers for the current place (for the initializing
    procedure, it is simply the beginning of the scanner
    buffer) and for the very last character in the scanner
    buffer.
}

The scanner may or may not read the whole input stream
at once, because its ring buffer has a limited size. The
next question is how to get a character from this buffer
(since tokens are not used, we have to deal with
characters), and, when the characters run out, how to
read some more input into this ring buffer.



char getchr()           /* get character routine */
{
    Gets the next character from the scanner buffer and loads
    it into the global "nextch". If it reaches the current end
    of the scanner buffer, it reads some more text from the
    source into the scanner buffer. If it meets the end of file
    character, it sets the "file_end" flag TRUE.
}

We can even put a character back into the buffer, if
needed.

ungetchr()              /* un-get character */
{
    Puts a given character back into the scanner buffer.
}
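
The three routines above can be sketched in C as follows. This is not the thesis code, only a minimal illustration under stated assumptions: the source file is replaced by an in-memory string, init_buf and ungetchr take explicit arguments, and the names BUF_SIZE, KEEP, refill, rd and wr are hypothetical. A NUL character stands in for the end of file character.

```c
#include <stddef.h>

#define BUF_SIZE 8   /* deliberately tiny so the ring wraps in small tests */
#define KEEP     2   /* already-read characters preserved across a refill  */

static const char *src;      /* stand-in for the open source file           */
static char ring[BUF_SIZE];  /* the scanner ring buffer                     */
static size_t rd, wr;        /* absolute read and write positions           */
static int file_end;         /* set once the input is exhausted             */

/* Load more input without overwriting the last KEEP characters read. */
static void refill(void)
{
    while (wr - rd < BUF_SIZE - KEEP && *src != '\0')
        ring[wr++ % BUF_SIZE] = *src++;
}

/* Attach the input and pre-load the ring buffer. */
void init_buf(const char *text)
{
    src = text;
    rd = wr = 0;
    file_end = 0;
    refill();
}

/* Return the next character, or '\0' (and set file_end) at end of input. */
char getchr(void)
{
    if (rd == wr)
        refill();
    if (rd == wr) {
        file_end = 1;
        return '\0';
    }
    return ring[rd++ % BUF_SIZE];
}

/* Put c back so that the next getchr() returns it.  Fails (returns 0)
   if stepping back would overwrite input that has not been read yet. */
int ungetchr(char c)
{
    if (rd == 0 || wr - (rd - 1) > BUF_SIZE)
        return 0;
    ring[--rd % BUF_SIZE] = c;
    return 1;
}
```

Keeping absolute positions and reducing them modulo the buffer size only at the point of access keeps the wrap-around logic in one place; the KEEP margin is what leaves room for a limited amount of backtracking after a refill.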

After initializing the scanner buffer, we can get as
many characters from it as we want. But parsers are higher
level concepts, and they shouldn't deal with the low level
details of scanning, like getting three more characters or
putting one back. Parsers mostly work on tokens. If we had a
pure tokenized implementation, we could simply pop a token
number from the scanner buffer. Here we need something else
to give tokens to the parser. White characters and comments
should also be ignored.

String tokens are given to the parser by the following
routine.



matchtoken(str, whtchk)   /* match to a given string token */
char str[],               /* string token */
     whtchk;              /* boolean variable for white chr. check */
{
    This routine attempts to read the string token from the
    scanner buffer. A following white character or delimiter
    is optional, and this decision is made by the caller,
    namely the parser. It returns TRUE if the token matches
    (and a white character, optionally), else returns FALSE.
    In case of FALSE, it backtracks in the scanner buffer to
    its previous place.
}
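
A possible shape for such a token matcher is sketched below. It is an illustration only, not the thesis routine: the scanner buffer is simplified to an in-memory string addressed by the hypothetical pointer cur, and whtchk is interpreted here as "the token must not run straight into another identifier character", which is one reading of the optional white character or delimiter check described above.

```c
#include <ctype.h>
#include <string.h>

#define TRUE  1
#define FALSE 0

static const char *cur;   /* current place in the (in-memory) scanner buffer */

static void delwht(void)  /* skip blanks, tabs, CR and LF */
{
    while (*cur == ' ' || *cur == '\t' || *cur == '\r' || *cur == '\n')
        cur++;
}

/* Try to match the string token str at the current place.  With whtchk
   TRUE the token must be followed by a delimiter, so that "if" does
   not match the beginning of "ifx". */
int matchtoken(const char *str, int whtchk)
{
    const char *oldp;
    size_t len = strlen(str);

    delwht();
    oldp = cur;                        /* remember place for backtracking */
    if (strncmp(cur, str, len) != 0)
        return FALSE;                  /* nothing was consumed */
    cur += len;
    if (whtchk && (isalnum((unsigned char)*cur) || *cur == '_')) {
        cur = oldp;                    /* backtrack: only a prefix matched */
        return FALSE;
    }
    return TRUE;
}
```

Note that on failure the place is restored to the first non-white character, not to the position before the skipped white space, so a following attempt does not have to skip it again.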

The following routine attempts to match a single character
in the scanner buffer and returns a boolean result.

match(chr)              /* match to a single character */
char chr;               /* character to match */
{
    delwht();
    /* if character matches, return TRUE */
    if (nextch == chr)
    {
        nextch = getchr();
        return(TRUE);
    }
    return(FALSE);
}



Both of these routines delete white characters first.
And in case of FALSE, they do not backtrack to their
previous places exactly, otherwise the following routines
would have to skip white characters one more time. So, in
the FALSE or "unmatched" case, they backtrack to the very
first character which comes just after the white ones.

delwht()                /* delete white characters */
{
    Used by both the match-character and match-token routines.
    Skips all the following white characters (blank, tab,
    carriage return and line feed characters) and the comments
    in the scanner buffer.
}
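
As an illustration of the skipping rule just described, the sketch below removes white characters and C comments in one loop. It assumes an in-memory buffer addressed by a hypothetical pointer cur rather than the ring buffer of the thesis, and it simply stops at the end of the text if a comment is never closed.

```c
static const char *cur;   /* current place in the (in-memory) scanner buffer */

/* Skip white characters and comments, leaving cur on the first
   character that belongs to neither. */
void delwht(void)
{
    for (;;) {
        if (*cur == ' ' || *cur == '\t' || *cur == '\r' || *cur == '\n') {
            cur++;
        } else if (cur[0] == '/' && cur[1] == '*') {
            cur += 2;                       /* step over the opener */
            while (*cur != '\0' && !(cur[0] == '*' && cur[1] == '/'))
                cur++;
            if (*cur != '\0')
                cur += 2;                   /* step over the closer */
        } else {
            return;
        }
    }
}
```

Looping until neither case applies handles runs such as a comment followed by a newline followed by another comment.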

B. GRAMMAR

Since there is not a standard C language grammar, we first
had to write a grammar to parse. The Tiny-C subset was
discussed in the previous chapter, and its complete grammar
is presented in Appendix A.

In this grammar (Appendix A), any terminal or non-terminal
followed by a "*" character means "none or more," followed
by a "+" character means "one or more," and followed by a
"?" character means "optional" or "none or one." Under these
definitions, for example:

program :
    <pre-processor>* <data-definition>* <function-definition>+

The non-terminal <program> goes to any number of
<pre-processor>, followed by any number of <data-definition>,
followed by one or more <function-definition>.

The "|" character means "or". For example:

pre-processor :
    "#define" <file-definition> |
    "#include" <file-definition>

Thus, <pre-processor> goes to "#define" followed by
<file-definition>, or "#include" followed by
<file-definition>.

The "!" character means "allowed at most once." For
example:

switch-statement :
    "switch" "(" <arithmetic-expression> ")"
    "{" <case-stmt>* "}"

case-stmt :
    "case" | "default"! <constant-expression> ":"
    <statement>*

Thus, <switch-statement> can go to "default" at most
once.

C. PARSER

A very simple form of a working parser is presented in
Appendix B. It is a recursive descent parser, but with a
backtracking feature. There is a one-to-one correspondence
between non-terminal names in the grammar and function
names in the parser. The reader is encouraged to read the
parser with an eye on the grammar. With the grammar's help,
it is not difficult to understand the structure of the
parser.

In this first version of the Tiny-C parser, all functions
backtrack if they fail. In the real Tiny-C environment this
is largely unnecessary, because in the Tiny-C grammar
ambiguity exists in a few places only. The reason this first
version is presented in Appendix B is its clarity and
simplicity. In the following versions, unnecessary backtracks
have been taken out.

In all the routines in the parser, there are two
backtracking tools. First, the "oldp" old pointer points to
the parser's previous place in the scanner buffer, and
second, the "line_no" line number keeps track of the current
line number for error checking purposes. If a function fails,
these routines backtrack to their previous states and try to
find another legal path to parse.

Appendix C has the routines for the basic nonterminals
and terminals of the Tiny-C parser. Together with Appendix B,
it presents a working version of that parser.
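
The backtracking pattern can be illustrated with a small sketch. It is not taken from Appendix B: the two-rule grammar (a labeled statement and an assignment, which both begin with an identifier and so are not LL(1)) is hypothetical, as are the function names, and the scanner buffer is simplified to an in-memory string. Only the use of "oldp" and "line_no" follows the description above.

```c
#include <ctype.h>

static const char *cur;     /* current place in the scanner buffer */
static int line_no = 1;     /* current line number */

static void delwht(void)
{
    while (*cur == ' ' || *cur == '\t' || *cur == '\r' || *cur == '\n') {
        if (*cur == '\n')
            line_no++;
        cur++;
    }
}

static int match(char chr)
{
    delwht();
    if (*cur != chr)
        return 0;
    cur++;
    return 1;
}

static int identifier(void)
{
    delwht();
    if (!isalpha((unsigned char)*cur) && *cur != '_')
        return 0;
    while (isalnum((unsigned char)*cur) || *cur == '_')
        cur++;
    return 1;
}

/* labeled : identifier ":" */
static int labeled(void)
{
    const char *oldp = cur;         /* previous place in the buffer */
    int old_line = line_no;         /* previous line number */

    if (identifier() && match(':'))
        return 1;
    cur = oldp;                     /* failed: restore both states */
    line_no = old_line;
    return 0;
}

/* assignment : identifier "=" identifier ";" */
static int assignment(void)
{
    const char *oldp = cur;
    int old_line = line_no;

    if (identifier() && match('=') && identifier() && match(';'))
        return 1;
    cur = oldp;
    line_no = old_line;
    return 0;
}

/* statement : labeled | assignment */
int statement(void)
{
    return labeled() || assignment();
}
```

Because each function restores its saved state on failure, statement() can simply try the alternatives in order, which is why the grammar does not have to be rewritten into LL(1) form.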

D. DATA STRUCTURES FOR THE PARSER

Now is the time to introduce some data structures to
improve the Tiny-C parser. The first one is going to be a
name string structure, since all the following tables need
this structure.

1. Name String Implementation

A name string is basically a big character array (or
a string) which holds all the names used in the source file.
Tiny-C has two routines to implement this structure. The
first one is used to add a new name into the name string,
and the second one is used to look for a given name.

add_name()              /* add a name into the name string */
{
    Adds a new name into the name string from the "id_name"
    global variable. The "id_name" variable holds the current
    identifier name all the time. The function "identifier"
    in the parser sets this variable whenever it parses an
    identifier.
}

find_name()             /* find a name in the name string */
{
    Looks for "id_name" in the name string. If found, it
    loads the identifier's address into a pointer and returns
    TRUE, else returns FALSE.
}



In the current version of Tiny-C, the name string was
implemented completely sequentially. A hashing mechanism
could have been used instead, which would be much more
efficient. When testing the whole compiler, it was observed
that a large number of predefined constant names and
variables made execution slow.
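
A sequential name string of the kind described can be sketched as below. The layout (names stored back to back, each terminated by a NUL character) and the identifiers NAMES_SIZE, name_str, name_end and name_ptr are assumptions made for illustration; only "id_name", add_name and find_name come from the text.

```c
#include <string.h>

#define NAMES_SIZE 1024

static char name_str[NAMES_SIZE];   /* names back to back, NUL-terminated */
static size_t name_end;             /* first free position in name_str    */
static char id_name[32];            /* current identifier, set by parser  */
static char *name_ptr;              /* address of the name last found     */

/* Look for id_name by walking the name string from the beginning. */
int find_name(void)
{
    size_t i = 0;

    while (i < name_end) {
        if (strcmp(&name_str[i], id_name) == 0) {
            name_ptr = &name_str[i];
            return 1;                        /* TRUE */
        }
        i += strlen(&name_str[i]) + 1;       /* jump to the next name */
    }
    return 0;                                /* FALSE */
}

/* Append id_name to the name string; returns 0 if there is no room. */
int add_name(void)
{
    size_t len = strlen(id_name) + 1;        /* include the terminator */

    if (name_end + len > NAMES_SIZE)
        return 0;
    memcpy(&name_str[name_end], id_name, len);
    name_ptr = &name_str[name_end];
    name_end += len;
    return 1;
}
```

Each lookup costs time proportional to the number of names stored, which is consistent with the slowdown reported above; a hash table keyed on the name would reduce the typical lookup to a few comparisons.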

2. Constant Table

Constants are implicitly declared elements. In
the Tiny-C compiler, a constant table is implemented to take
care of them. Since every occurrence of a constant denotes
the same declaration, we do not need to check if a constant
occurs more than once. We simply add each constant into the
constant table as it occurs.

add_num()               /* add an integer number into constant table */
{
    Adds an integer numeric value into the constant table
    if it is not in there. Returns its address in a pointer.
}

In the current version of AM, integers are the only
numeric type. So it is the only numeric type implemented in
Tiny-C, and is the only constant denotation required.

Since the input data for the above routine is an integer,
and since the source file is read as a character stream from
the scanner buffer, we need a string-to-numeric conversion
routine to convert text input into numeric values.

str_num()               /* string to numeric */
{
    Takes a string "num_name" (numeric name) which is set by
    the "constant()" routine in the parser, calculates its
    numeric value, and returns it in the "num_cnst" (numeric
    constant) global as an integer.
}
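
The conversion itself is the usual digit-by-digit accumulation. The sketch below uses the two globals named in the text; the buffer length and the decision to stop at the first non-digit are assumptions, and overflow of large constants is not checked.

```c
static char num_name[16];   /* digits of the constant, set by the parser */
static int num_cnst;        /* resulting numeric constant */

/* Convert the decimal digits in num_name to an integer in num_cnst. */
void str_num(void)
{
    const char *p = num_name;

    num_cnst = 0;
    while (*p >= '0' && *p <= '9') {
        num_cnst = num_cnst * 10 + (*p - '0');   /* shift left, add digit */
        p++;
    }
}
```

For the digits "1986" the accumulator takes the values 1, 19, 198 and finally 1986.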



3. Definition Table

In Tiny-C, the preprocessor command "#define" lets
us define constant identifiers. So, a definition table is
implemented for these identifiers.

In case of a "#define" declaration, we need to add
a new constant identifier into the definition table.

add_cnid()              /* add constant identifier */
{
    First, checks if the given id-name is already in the
    definition table. If so it gives an error, since defining
    the same constant-id more than once is meaningless.
    Otherwise it adds the given constant identifier into the
    definition table.
}

The next problem in implementing constant identifiers
is finding the corresponding values for these constant
id-names when they are met while parsing a program.

find_cnid()             /* find constant identifier */
{
    Takes a constant identifier name and looks for it
    in the definition table. If found, it sets a pointer to
    its place in the definition table and returns TRUE, else
    it returns FALSE.
}



4. Scoping Rule

In classical compilers, symbol tables are primarily
responsible for establishing the scoping rules. The Tiny-C
compiler solves the scoping problem in a different way.

Our Tiny-C compiler has a variable string which holds
all valid variable names in the current scope. When the
parser starts parsing a new function or a new compound
statement (namely a new "block" in block structured language
literature), it puts a mark into the variable string to
define the beginning of the new block, and adds the
following variable declarations into the same string.
Whenever the parser goes out of a block, it deletes the
very last block's variables from this string. (Since the
Tiny-C compiler is a one-pass compiler, the deletion of the
variables for the last block is acceptable in this case.) So,
any time a variable is used, the compiler looks for this
variable in the variable string, starting from the end and
moving to the beginning. If found, it obtains a pointer to
the symbol table entry for this variable; if not, it gives
an error message, since that particular variable is unknown
(or out of scope).

find_var()              /* find a variable in variable string */
{
    Takes an id-name and looks for it in the variable
    string. If found, it sets a pointer to its place in the
    symbol table and returns TRUE, otherwise it returns FALSE.
}

We introduced searching for variable names in the
variable string before discussing inserting them. The reason
is that whenever the parser meets a new variable declaration,
it is supposed to add that new variable into both the symbol
table and the variable string. In the Tiny-C compiler, one
single routine does both these duties. Since the symbol
table has not been introduced yet, we have not met this
routine either.

Here, the theory to satisfy the scoping rule is: mark the
beginning of a block in the variable string when starting
to parse a new block, and delete the most recent block's
variables when exiting from it. So any variable which is not
in the variable string is automatically out of scope.
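
The mark-and-delete scheme can be sketched as a simple stack. This is an illustration under assumptions, not the thesis data structure: the variable string is modelled as an array of name pointers in which a NULL entry is the block mark, the routines enter_block, declare_var and leave_block are hypothetical, and find_var here returns an index (or -1) instead of setting a symbol table pointer.

```c
#include <string.h>

#define MAX_VARS 64

static const char *var_str[MAX_VARS];   /* NULL entries are block marks */
static int var_top;                     /* number of entries in use */

/* Put a mark at the beginning of a new block. */
int enter_block(void)
{
    if (var_top >= MAX_VARS)
        return 0;
    var_str[var_top++] = NULL;
    return 1;
}

/* Add a variable declared in the current block. */
int declare_var(const char *name)
{
    if (var_top >= MAX_VARS)
        return 0;
    var_str[var_top++] = name;
    return 1;
}

/* Delete the most recent block: pop entries up to and including its mark. */
void leave_block(void)
{
    while (var_top > 0 && var_str[--var_top] != NULL)
        ;
}

/* Search from the end to the beginning, so the innermost declaration
   wins.  Returns the entry index, or -1 if the name is out of scope. */
int find_var(const char *name)
{
    int i;

    for (i = var_top - 1; i >= 0; i--)
        if (var_str[i] != NULL && strcmp(var_str[i], name) == 0)
            return i;
    return -1;
}
```

Searching backwards is what makes an inner declaration hide an outer one of the same name, and popping to the mark is what takes a block's variables out of scope in a single step.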

5. Symbol Table

In the Tiny-C implementation, the symbol table is
responsible for variables, function names, label names, and
function arguments.

Let's first start with how to add a new variable into
the symbol table when a variable declaration occurs.

add_var()               /* add variable */
{
    Gets a new variable's id_name and its type, then adds
    it into the symbol table and the variable string.
}



Similarly, label declarations require label names to
be added into the symbol table, too. But we shouldn't
add labels into the variable string, since in "C" they do
not obey the same scoping rules as variables.

add_label()             /* add a label into symbol table */
{
    Gets a label, and adds it to the end of the symbol table.
}

Whenever the parser meets a new label declaration, it
adds this label into the symbol table by the above routine.
But it must be smart enough not to accept duplicate label
declarations.

One pointer is assigned to point to the beginning of the
very last function in the symbol table. So, when the
parser meets a new label declaration, it starts from
the beginning of the last function in the symbol table and
goes all the way down to the end of it, looking for the
same label name. If it finds one, it gives a duplicated
label declaration error, since the same label is not allowed
to be declared twice in the same routine in this language.
The following routine does this job in the Tiny-C compiler.

dup_lbl()               /* is duplicate label? */
{
    Checks if the same label name has been declared before.
}

6. Label Table

In the C language, any label referenced by a goto
statement has to be declared somewhere in the same function.
Classically, compilers read the source file twice. But the
number of input/output operations is very important for
total execution speed. Since the Tiny-C compiler is
designed as a "one-pass compiler", we immediately have
this problem: detection of undeclared labels.

Classical two pass compilers read all label declarations
in the first pass. So, in the second pass they can check if
"goto label" statements are valid. When our one-pass Tiny-C
compiler meets a goto statement, and the referenced label
name has not been declared yet, it is unpredictable whether
this label is going to be declared in the following
statements. To solve this problem, Tiny-C implements a
label table, and at the end of every function, it
checks if a referenced but undeclared label exists.

Whenever a label is referenced by a goto statement,
the compiler saves it in the label table by the following
routine.

save_lbl()              /* save label into the label table */
{
    Inserts a label which is referenced by a "goto"
    statement into the label table for future checking.
}

And at the end of every function, the compiler
checks if the labels referenced by gotos were ever declared
in the function.

check_labels()          /* check labels */
{
    Called by the parser at the end of every function body.
    Checks if labels in the label table are declared in the
    symbol table.
}
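
The deferred check can be sketched as follows. The two arrays are stand-ins chosen for illustration: in the thesis, declared labels live in the symbol table, while here a separate hypothetical array "declared" plays that role. The routines take explicit arguments and return status values, which is also an assumption.

```c
#include <string.h>

#define MAX_LABELS 32

static const char *declared[MAX_LABELS];    /* labels seen as "name:"      */
static const char *referenced[MAX_LABELS];  /* labels seen in "goto name"  */
static int n_declared, n_referenced;

static int is_declared(const char *name)
{
    int i;

    for (i = 0; i < n_declared; i++)
        if (strcmp(declared[i], name) == 0)
            return 1;
    return 0;
}

/* Record a label declaration; returns 0 for a duplicate in this function. */
int add_label(const char *name)
{
    if (is_declared(name) || n_declared >= MAX_LABELS)
        return 0;
    declared[n_declared++] = name;
    return 1;
}

/* Remember a label used by a goto, whether or not it is declared yet. */
int save_lbl(const char *name)
{
    if (n_referenced >= MAX_LABELS)
        return 0;
    referenced[n_referenced++] = name;
    return 1;
}

/* At the end of a function body: count referenced labels that were
   never declared, then clear both tables for the next function. */
int check_labels(void)
{
    int i, missing = 0;

    for (i = 0; i < n_referenced; i++)
        if (!is_declared(referenced[i]))
            missing++;
    n_declared = n_referenced = 0;
    return missing;
}
```

Postponing the check to the end of the function is what lets a single pass accept forward references such as a goto that precedes its label.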

7. Function Calls

Tiny-C keeps function names and their argument counts in
the symbol table. In case of a function call, it checks
if this function has been called before, and if it has not,
enters its name and argument count into the symbol table.
If it has been entered before, it checks if the argument
count in the new function call is the same as the one in the
symbol table. If the argument counts are not the same, it
gives an "inconsistent argument count" error.

add_fun(fun_no)         /* add function into symbol table */
char *fun_no;
{
    Adds a function name and its argument count into the
    symbol table in case of a function call, if it is the
    first call of the function. If it is not the first call,
    the function is already in the symbol table, so it checks
    if the argument counts match. In both cases, it returns the
    function's function number (basically the symbol table
    entry number) to the parser, to emit intermediate code.
}
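
A minimal sketch of this bookkeeping is given below. It does not reproduce the thesis signature: here add_fun takes the name and the argument count directly and returns the entry number, and the table layout, MAX_FUNS and the use of err_cnt to record the "inconsistent argument count" error are assumptions.

```c
#include <string.h>

#define MAX_FUNS 32

struct fun_entry {
    const char *name;   /* function name */
    int argc;           /* argument count seen at the first call */
};

static struct fun_entry funs[MAX_FUNS];
static int n_funs;
static int err_cnt;     /* incremented on "inconsistent argument count" */

/* Enter a function on its first call, or check the argument count on
   later calls.  Returns the entry number, or -1 if the table is full. */
int add_fun(const char *name, int argc)
{
    int i;

    for (i = 0; i < n_funs; i++) {
        if (strcmp(funs[i].name, name) == 0) {
            if (funs[i].argc != argc)
                err_cnt++;              /* counts differ: report an error */
            return i;
        }
    }
    if (n_funs >= MAX_FUNS)
        return -1;
    funs[n_funs].name = name;
    funs[n_funs].argc = argc;
    return n_funs++;
}
```

The entry number is returned in both cases so that the caller can emit the same function reference regardless of whether this was the first call.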



Q- Funct ion Declarat ions 



In the C language, parameter declarations follow a 
function declaration- Parameter names have to be given 
inside parentheses immediately following a function nc^rne, 
and then they have to be declared one rnov^e time with their 
t y pes- 

The following parameter declarations have to match the 
ones given with function name. Tiny-C has two routines to 
get this mechanism to work properly- 



chk_prmt() /* check parameter */

{

At the end of a parameter declaration, this routine
checks whether that parameter was given as one of the function's
arguments, and whether it is declared more than once. If
everything is proper, it enters the parameter's type into the
symbol table, since the parameter name was already entered
before (when parsing the parameter list following the
function name).
}

And, at the end of all parameter declarations, the compiler
has to make sure that all the arguments given with the
function name were declared as parameters.



chk_parms() /* check all the parameters */

{

When parameter declarations are done, checks if there is
any parameter name in the symbol table without its type.
Since parameter names are entered into the symbol table when
parsing the parameter list, and types are entered there
when parsing the following parameter declarations, if there
is any parameter with its type missing, that means it is not
declared.
}
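A minimal sketch of the two checks, assuming a small parameter table in which a type of zero means "name seen in the list, not yet declared". The table, the type codes, the result codes and the helper enter_parm_name are hypothetical; the thesis routines work on the real symbol table and take no arguments.

```c
#include <string.h>

#define MAX_PARMS 8   /* hypothetical limit */
#define NO_TYPE   0   /* name entered from the parameter list, type still missing */
#define T_INT     1   /* hypothetical type codes */
#define T_CHAR    2

enum { PRM_OK, PRM_NOT_IN_LIST, PRM_REDECLARED };

struct parm { const char *name; int type; };

struct parm parms[MAX_PARMS];
int parm_cnt = 0;

/* Called while parsing the list in parentheses after the function name. */
void enter_parm_name(const char *name)
{
    if (parm_cnt < MAX_PARMS) {
        parms[parm_cnt].name = name;
        parms[parm_cnt].type = NO_TYPE;
        ++parm_cnt;
    }
}

/* Called at the end of one parameter declaration (cf. chk_prmt). */
int chk_prmt_sketch(const char *name, int type)
{
    int i;

    for (i = 0; i < parm_cnt; ++i) {
        if (strcmp(parms[i].name, name) == 0) {
            if (parms[i].type != NO_TYPE)
                return PRM_REDECLARED;   /* declared more than once */
            parms[i].type = type;        /* enter the type */
            return PRM_OK;
        }
    }
    return PRM_NOT_IN_LIST;              /* not one of the function's arguments */
}

/* Called after all parameter declarations (cf. chk_parms):
   returns how many listed parameters were never declared. */
int chk_parms_sketch(void)
{
    int i, missing = 0;

    for (i = 0; i < parm_cnt; ++i)
        if (parms[i].type == NO_TYPE)
            ++missing;
    return missing;
}
```

A missing type doubles as the "not declared" flag, so the final check needs no extra bookkeeping.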



These are all the data structures used by the parser to
manage variables, constants, labels, function names and
arguments, and all remaining structures in the Tiny-C parser.
The following section improves the parser one more step, and
handles the error checking mechanism.



E. ERROR CHECKING 

A list of error and warning messages used in the Tiny-C
compiler is given in Appendix D.

Error and warning messages are given by the
following routines:



err_msg (msg_no) /* error messages */

char msg_no;

{

/* increment error counter */

++err_cnt;

/* give line number of the error */

printf ("%d error! ", line_no);

/* and give the error message */

switch (msg_no)

{

case list for all error messages described in Appendix D.

}

}



warning (msg_no) /* warning messages */

char msg_no;

{

/* give line number of the warning */

printf ("%d warning! ", line_no);

/* give the message */

switch (msg_no)

{

case list for all warning messages described in Appendix D.
}

}



F. INTERMEDIATE CODE GENERATION

In order to generate code for the target machine, the
compiler first has to build a parse tree. Appendix E is a
list of nodes that form Tiny-C parse trees.

Now, the same old heavy-duty parser can shoulder one more
job: emission of intermediate code.

The following routine does the intermediate code
emissions when called by the parser. It takes two arguments:
the node itself, and the number of children of this node.
If no error has occurred up to that time, the parser emits
the code into an emission table (which is in fact a
flattened parse tree) and increments the emit counter.



emit (node, child) /* emit intermediate code */

char node, /* node kind to emit */

child; /* # of the children belonging to this node */

{

/* if there is not any error, give emissions */

if ( !err_cnt )

{

emitstr[emit_cnt] = node;
emitchl[emit_cnt] = child;

++emit_cnt;

}

}



G. POSTPONED EMISSIONS 

There are times when we do not want to emit code in the
same order as we parse. An assignment statement is a good
example of this situation.
Suppose we have the assignment:

joe = jimmy * 5;

The parse tree for this statement is:

              assignment

    variable            multiplication

      joe           variable      constant

                     jimmy           5

Since our parse tree is in flattened form, the order of
the intermediate code emissions for the above tree should
be:

jimmy, variable, 5, constant, multiplication, joe, variable,
assignment

But this is not the same order in which we parse! There may be
some quick solutions for this particular problem. But the
case might be worse than the above one. Consider the
following statement:

joe[ (jimmy*15) % mary++ ] = joe[2];

Here, the left value is not a simple variable. It is an
array element with a complex index expression.

Summarizing, there are cases when we simply do not want
to give emissions immediately. We want to save them, and then
emit them at the end of certain expressions.
This type of emission is called "postponed emission."

Up to now, our recursive descent parser has been
suffering from the same problem. But for the sake of simplicity,
we ignored it. Now is the time to build some mechanisms that
enable the parser to postpone emissions.

First of all, we have to make our emission tool more
flexible. The following is a revised version of our "emit-code"
function.

emit (node, child)

int node,

child;

{

/* if there are not any errors, give emissions */

if ( !err_cnt )

{

*(emitptr[0] + (*(emitptr[2]))) = node;

*(emitptr[1] + (*(emitptr[2]))) = child;

++(*(emitptr[2]));

}

}

As can be seen, this revised version is not restricted to
emitting code into the emission table all the time. It can emit code
into any table which is addressed by the "emitptr" pointers. That
is, by setting these pointers somewhere else, we can
"redirect" the emissions.

The following routine directs emissions into a given
pointer set. This given pointer set is supposed to be
pointing to a table of the same type as the emission table.



drct_emit (emit_ptr, ptr1, ptr2, ptr3) /* direct emits */
int *emit_ptr[], /* pointer set to emissions */

ptr1[],

ptr2[],

*ptr3; /* pointers to new direction */

{

emit_ptr[0]=ptr1;
emit_ptr[1]=ptr2;
emit_ptr[2]=ptr3;

}

As we have seen before, our emit-code routine emits
into a table pointed to by the "emitptr" global emission
pointers. But if we redirect these pointers somewhere
else, don't we lose the address of the previous table? So, we
have to be able to save our previous emission addresses
somewhere. The following routine saves these pointer
addresses in given ones.



rplc_emits (ptr_a, ptr_b) /* saving emit pointers */

int *ptr_a[],

*ptr_b[]; /* pointer sets to both emit-tables */

{

ptr_a[0]=ptr_b[0];
ptr_a[1]=ptr_b[1];
ptr_a[2]=ptr_b[2];

}

And the very last problem: we are able to redirect our
"emitptr" emission pointers into some tables (obviously,
successive emissions are then entered into these tables). We
are able to save the previous value of these pointers. But
what about the "postponed emissions", namely the ones we
saved somewhere other than our emission table? The
following routine transfers previously saved emissions from
one table into another.

trns_emits (emit_a, emit_b) /* transfer emits */

int *emit_a[], /* destination table pointers */

*emit_b[]; /* source table pointers */

{

char i;

for (i=0; i< (*(emit_b[2])); ++i)

{

*(emit_a[0] + (*(emit_a[2]))) = *(emit_b[0] + i);

*(emit_a[1] + (*(emit_a[2]))) = *(emit_b[1] + i);

++(*(emit_a[2]));

}

}
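To see how the four routines cooperate, here is a small sketch in modern C that postpones the left value of the assignment "joe = jimmy * 5" and reproduces the emission order given earlier. The node codes, the leaf numbers, the table size, the child counts given to leaves and the driver compile_joe_assignment are assumptions made for illustration; only the bodies of emit, drct_emit, rplc_emits and trns_emits follow the listings above.

```c
#define TAB_SIZE 32   /* hypothetical table size; no overflow check in this sketch */

/* Hypothetical node kinds and leaf numbers, for illustration only. */
enum { VARIABLE = 100, CONSTANT, MULTIPLICATION, ASSIGNMENT };
enum { JOE = 1, JIMMY = 2, FIVE = 5 };

/* Main emission table and a side table for postponed emissions. */
int main_str[TAB_SIZE], main_chl[TAB_SIZE], main_cnt;
int side_str[TAB_SIZE], side_chl[TAB_SIZE], side_cnt;

int err_cnt = 0;
int *emitptr[3] = { main_str, main_chl, &main_cnt };

void emit(int node, int child)
{
    if (!err_cnt) {
        *(emitptr[0] + *emitptr[2]) = node;
        *(emitptr[1] + *emitptr[2]) = child;
        ++(*emitptr[2]);
    }
}

void drct_emit(int *emit_ptr[], int ptr1[], int ptr2[], int *ptr3)
{
    emit_ptr[0] = ptr1;
    emit_ptr[1] = ptr2;
    emit_ptr[2] = ptr3;
}

void rplc_emits(int *ptr_a[], int *ptr_b[])
{
    ptr_a[0] = ptr_b[0];
    ptr_a[1] = ptr_b[1];
    ptr_a[2] = ptr_b[2];
}

void trns_emits(int *emit_a[], int *emit_b[])
{
    int i;

    for (i = 0; i < *emit_b[2]; ++i) {
        *(emit_a[0] + *emit_a[2]) = *(emit_b[0] + i);
        *(emit_a[1] + *emit_a[2]) = *(emit_b[1] + i);
        ++(*emit_a[2]);
    }
}

/* Hypothetical driver: emissions in the order a parser meets "joe = jimmy * 5". */
void compile_joe_assignment(void)
{
    int *saved[3];
    int *side[3];

    main_cnt = side_cnt = 0;
    drct_emit(emitptr, main_str, main_chl, &main_cnt);
    drct_emit(side, side_str, side_chl, &side_cnt);

    rplc_emits(saved, emitptr);        /* remember the main table */
    rplc_emits(emitptr, side);         /* redirect into the side table */
    emit(JOE, 0);                      /* left value is parsed first ... */
    emit(VARIABLE, 1);                 /* ... but its emission is postponed */

    rplc_emits(emitptr, saved);        /* back to the main table */
    emit(JIMMY, 0);
    emit(VARIABLE, 1);
    emit(FIVE, 0);
    emit(CONSTANT, 1);
    emit(MULTIPLICATION, 2);

    trns_emits(emitptr, side);         /* now release the postponed left value */
    emit(ASSIGNMENT, 2);
}
```

After compile_joe_assignment() the main table holds eight entries in the order jimmy, variable, 5, constant, multiplication, joe, variable, assignment, although the left value was seen first.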

H. CODE OPTIMIZATION

Under normal conditions, code optimization can be done
on both intermediate code and target code. When generating
target code, compilers attempt to find the best code
generation sequence, eliminate common sub-expressions, and
minimize the number of temporary variables. And after
code generation is done, they pass through it again one or
two times for peep-hole optimization, jump optimization,
etc.

Our Tiny-C intermediate code has a flattened tree
structure; it is possible to traverse it as a tree. In order
to do this, we will need some interface routines between
this flattened form and a real tree structure. Then we can
logically look at it as a tree and travel from root to leaves
or vice-versa.

In this thesis work, it was decided to generate code as
quickly and simply as possible. So the Tiny-C compiler uses
sequential code generation, even though it is not the best
way to do it.

Since our code is going to be source code for the AM
assembler, it is not going to be easy to work on a "text"
file to optimize it. At this point, we can work on our
intermediate code to make it more effective. So, contrary to
the classical compilers, our code optimization is going to be
only on intermediate code, instead of both intermediate and
target codes.

There are several things we do in the code optimization
phase:

- Removing dead code

- Label/jump optimization

- Emitting imbedded assignments

The last one cannot be classified as part of the code
optimization phase, although we deliberately left it to this
point. We will see why pretty soon.

1. Dead Code Elimination

In some cases, the Tiny-C compiler generates dead
code. For instance:
In the intermediate code list, there is a node
called "DUMMY". Sometimes our parser may emit some code, but
then it may realize that this code is not necessary. In that
case emitting a "DUMMY" node makes this previous code "out
of concern" or a "dummy statement".

In fact, such a tool is not truly necessary, but was used
in early versions of the compiler. In the following phases
this "DUMMY" node was used only in the "case" statement. Due
to constraints on time, it has not been removed.

As we discussed before, this thesis is a presentation of
the first version of the Tiny-C compiler, and hopefully a
reference for its future authors, rather than a discussion
about compiler writing techniques.

Nevertheless, to simplify the tree we can remove this
"DUMMY" node and its children.

In addition, there may be dead code that is generated by
the compiler. An example:

joe = 2;
goto there;
joe = jimmy * 5;
++jimmy;

there:

Here the two statements in the third and fourth lines are
dead code. They will never be executed. So, we can remove this
dead code from the parse tree.

2. Dead Label Elimination

In general, any label declaration is automatically the
beginning of a new basic block. However, if there is no
"goto" for this label, then such a label is part of a larger
basic block.

Having basic blocks as large as possible reduces the
amount of data transfer between registers and memory. In
other words, it reduces the number of "register cleaning"
operations.

So, if we detect labels which are declared but never
used, removing them is going to be an improvement.

3. Temporary Variables in the Front End

There is one more thing that has to be done when
passing over the intermediate code for optimizing purposes.

In the parser, arithmetic expressions following a
"switch" reserved word are assigned to some temporary
variables. These temporary variables are represented by
"TVAR" nodes, with a temporary variable number. Since the
results of those arithmetic expressions are assigned to
"TVAR" nodes, and these variables are compared with "case"
labels, we have to allocate memory for these nodes just as
we are going to do for normal variables. The values of
"TVAR" nodes may or may not reside in their allocated memory
locations; they may be kept in registers, too. The register
manager in the following section will treat them just
like variable nodes.

In fact, all variables are referred to by their
symbol numbers, or their symbol table entry numbers. And at
this point, we know our symbol table length. So we can assign
some new symbol numbers to these "TVAR" nodes, and change
their names to "VARB" variable nodes. Then the register
manager can take care of the rest.

4. Code Optimization. Phase 1.

The following routine is the first part of the
intermediate code optimization, and is called just after the
parser.

frstopt() /* first pass of optimization */

{

- Detects dead code and replaces it with "NOOP" (no
operation) nodes.

- Detects unused labels and replaces them with "NOOP"
nodes.

- Replaces "TVAR" nodes with "VARB" nodes and assigns
them new symbol numbers starting from the last symbol number
in symbol table.

}
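The three steps can be illustrated on a simplified, statement-level list rather than on the real flattened tree. Everything in this sketch is an assumption made for illustration: the node layout, the kind codes, the name frstopt_sketch, and the rule that temporary number k becomes symbol number sym_cnt + k.

```c
enum kind { STMT, GOTO, LABEL, NOOP, TVAR, VARB };

/* num is the label number for GOTO/LABEL, the variable number for TVAR/VARB */
struct node { enum kind kind; int num; };

void frstopt_sketch(struct node *code, int n, int sym_cnt)
{
    int i, j, dead = 0, used;

    /* 1. code after a goto is unreachable until the next label */
    for (i = 0; i < n; ++i) {
        if (code[i].kind == LABEL)
            dead = 0;
        else if (dead)
            code[i].kind = NOOP;
        else if (code[i].kind == GOTO)
            dead = 1;
    }

    /* 2. a label that no surviving goto refers to is not needed */
    for (i = 0; i < n; ++i) {
        if (code[i].kind != LABEL)
            continue;
        used = 0;
        for (j = 0; j < n; ++j)
            if (code[j].kind == GOTO && code[j].num == code[i].num)
                used = 1;
        if (!used)
            code[i].kind = NOOP;
    }

    /* 3. temporaries become ordinary variables with fresh symbol numbers */
    for (i = 0; i < n; ++i) {
        if (code[i].kind == TVAR) {
            code[i].kind = VARB;
            code[i].num += sym_cnt;
        }
    }
}
```

Replacing nodes with NOOP instead of deleting them keeps every index in the table stable, which is presumably why the actual removal is left to the second pass.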

5. Separation of Front End and Code Generator

Up to now, our intermediate code has been in memory,
in its allocated location (emission table). Because of its
fixed size, the emission table has to be large enough to
keep the largest program in it. If the input
source file is too big to fit into our emission table, Tiny-C
responds with an error message. (This is one of the reasons
it is called Tiny-C.)

It is possible to pass this emission table to the second,
target machine dependent part of the compiler, but it would
not be efficient.

There is a logical separation between parser/
intermediate code generator and target code generator.
The first part is totally language dependent and machine
independent, and the second part is machine dependent but
language independent. So, putting a physical separation
between these logically independent units is always a good
idea, and has been implemented in most compilers.

For this reason we should end the first part of
this compiler here. But before doing this, we have to pass
the outcome of this part to the second part of the compiler
(basically, the code generator of Tiny-C).

The code generator is going to need intermediate code,
a symbol table, a constant table, and the number of
temporary variables used by the parser. All this information
has to be written in some place for later access by the code
generator.

But we have a last minute problem here, which we
deliberately ignored up to now. This is "imbedded
assignments."

6. Imbedded Assignments

In the C language the statement:

joe = jimmy++ * 5;

is in fact two different statements:

joe = jimmy * 5; and a following:
++jimmy;

The second statement here is an "imbedded assignment." We
didn't emit code for imbedded assignments up to now, and in
fact we have ignored this problem on purpose, because right
now, when writing intermediate code into a quad file, we
can simply emit these codes without any effort.

7. Code Optimization. Phase 2. The Quad File Filter

The following routine is the second part of the
intermediate code optimizer. It is called just after the
first-pass optimizer.

scndopt() /* second-pass optimization */

{

Creates a quad file named "TC.QQQ" and:

- Writes intermediate code in this file, without
"NOOP" codes and with additional imbedded assignments.

- Marks end of intermediate code

- Writes symbol table

- Writes number of the temporary variables (TVARs)

- Writes constant table

- Writes name string

- And closes that quad file.

}

I. DATA STRUCTURES FOR CODE GENERATION

The final step is code generation for the Abstract
Machine.

As discussed before, the output of this compiler is not
going to be binary code which is ready to be linked and
run. It is going to be a source file for the AM assembler,
so it will be readable.

Since this is the second part of the compiler, it
receives the work done in the first part. The following
routine reads a Tiny-C quad file from the disk.

read_quad() /* read quad file */

{

Reads quad file from disk in a sequence of intermediate
code, symbol table, constant table and name string.

}

Now the compiler has all the information it needs to go
ahead and generate code. But right now it does not have any
tools to do this. We build some tools first, to help the
code generation phase.

The target machine AM theoretically has an unlimited
number of registers. This is not realistic. So, the Tiny-C
compiler considers that AM has a reasonable number of
registers, and tries to manage them properly.

Keeping all the variables and all the intermediate
results in registers would be awfully nice. But since this is
impossible and we are going to run out of registers after
generating a piece of code, we will need a "register
manager" to handle the limited number of registers
properly. The Tiny-C compiler does not have a single "register
manager" routine. Instead, we will introduce a couple of
routines which manage AM registers properly.

1. Address Descriptors

As is known, a compiler cannot keep all the
variables in registers all the time. So, it is obvious
that a variable may be in a register, or in its
allocated memory location, or both, at a particular time.
A compiler needs a mechanism to keep track of the current
addresses of all variables. The following routine sets symbol
addresses according to the given parameters.



addr_dscr (sym_no, status, reg_no) /* symbol addr. descriptor */
char sym_no, /* symbol number */

status, /* address status */

reg_no; /* register number */

{

Sets current addresses of variables. All variables
have an 8-bit value address descriptor. Status may be
"in-register", "in-memory" or "in-both". If the 7th bit of
this descriptor is 1, the variable is in its
allocated memory location. If the value stored in bits 0 to
6 is zero, the variable is not in any register. If it is
different from zero, that value minus one gives the number
of the register in which the symbol is stored.

}
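The bit layout just described can be made concrete with a few helpers. The helper names and the use of unsigned char are assumptions made for this sketch; only the encoding (bit 7 for "in memory", bits 0 to 6 holding the register number plus one) comes from the description above.

```c
#define IN_MEM_BIT 0x80u   /* bit 7: value is in its allocated memory location */
#define REG_MASK   0x7Fu   /* bits 0 to 6: register number + 1, or 0 for "no register" */

typedef unsigned char addr_desc;

/* reg_no < 0 means "not in any register" (a convention of this sketch) */
addr_desc make_desc(int in_memory, int reg_no)
{
    addr_desc d = in_memory ? IN_MEM_BIT : 0;

    if (reg_no >= 0)
        d |= (addr_desc)((reg_no + 1) & REG_MASK);
    return d;
}

int desc_in_mem(addr_desc d) { return (d & IN_MEM_BIT) != 0; }
int desc_in_reg(addr_desc d) { return (d & REG_MASK) != 0; }

/* register number, or -1 when the value is in no register */
int desc_reg(addr_desc d)    { return (int)(d & REG_MASK) - 1; }
```

Storing the register number plus one is what lets zero stand for "no register", so register 0 remains usable.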



Exactly the same problem exists for constants. Even
though constant values are fixed and they reside in a
constant table all the time, the compiler should not transfer
a constant value into a register if it is already in one.
The following routine sets a constant address descriptor.



cnst_adr_dscr (cnst_no, status, reg_no)

int cnst_no; /* constant number */

char status, /* status */

reg_no; /* register number */

{

Sets current addresses of constants. All constants have
an 8-bit value address descriptor. If the 7th bit of this value
is 1, and all others are zero, that means the constant is
not in any register. Otherwise the value of this descriptor
gives the number of the register in which the constant resides.
}

2. Temporary Management



There is one more address problem. When
calculating an arithmetic expression, we may have a couple
of temporary results. For instance:

joe = jimmy * 5 + joe * 3;

The statement has the following parse tree:

                      assignment

      variable                        addition

        joe          multiplication          multiplication

                  variable   constant     variable   constant

                   jimmy        5           joe         3

Here, the compiler calculates "jimmy * 5" and "joe * 3"
first. Since it has to keep these results somewhere
temporarily, we have to manage these temporaries and
keep track of their addresses.

The Tiny-C compiler manages temporaries' addresses in exactly
the same way as it does for variables. In addition, it may
dispose of a temporary, so we can use the same temporary number
somewhere else later.



dispose_temp (temp_no) /* dispose temporary */
char temp_no;

{

Disposes the given temporary variable.

}

When the code generator finishes a statement completely,
there is no need for any temporary in Tiny-C's sequential
code generation order. So at the end of every statement,
the compiler disposes of temporary variables.



clean_temps () /* clean all temporaries */

{

Disposes all the temporaries.

}



The compiler needs a new temporary every time it calculates a
temporary result. So, the following routine provides new
temporaries to the code generator.



get_a_temp (temp_no) /* get a temporary variable */

char *temp_no;

{

Finds an unused temporary, returns its number to the code
generator, and marks it "used."

}
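One simple way to realise these three routines is a table of "used" flags. This is a sketch under assumptions: the pool size MXTEMP, the flag array and the success/failure return value of get_a_temp are not taken from the thesis.

```c
#define MXTEMP 8            /* hypothetical number of temporaries */

char temp_used[MXTEMP];     /* 1 = temporary is in use */

/* Finds an unused temporary, marks it used and stores its number in
   *temp_no. Returns 1 on success, 0 when every temporary is in use. */
int get_a_temp(char *temp_no)
{
    int i;

    for (i = 0; i < MXTEMP; ++i) {
        if (!temp_used[i]) {
            temp_used[i] = 1;
            *temp_no = (char)i;
            return 1;
        }
    }
    return 0;
}

/* Makes one temporary number available again. */
void dispose_temp(char temp_no)
{
    temp_used[(int)temp_no] = 0;
}

/* Called at the end of every statement: frees all temporaries. */
void clean_temps(void)
{
    int i;

    for (i = 0; i < MXTEMP; ++i)
        temp_used[i] = 0;
}
```

Because a disposed number is handed out again by the next search, a small pool suffices as long as clean_temps runs at every statement boundary.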



3. Finding Current Addresses

The compiler should be able to figure out any given
token's address at any time. Tiny-C uses the following
routines for this purpose.



is_inreg (token_no, kind) /* is token in a register? */

int token_no; /* token number */

char kind; /* token kind */

{

Takes token kind (variable, constant or a temporary
variable) and its token number, returns TRUE if it is stored
in a register, else returns FALSE.
}

If a particular token is in a register, it can be
figured out which register this one is.



get_reg_num (token_no, reg, kind) /* get register number */

int token_no; /* token number */

char *reg, /* register number */

kind; /* token kind */

{

Takes a token number and its kind, and returns its
register number in "reg" pointer.

}



After some operations, variable values may be only
in registers, and may not be in their allocated memory
locations. The compiler should figure out if a given variable
is in memory, to avoid transferring it into its memory
location unnecessarily.



(sym_no) /* is symbol in memory? */

char sym_no; /* variable's symbol number */

{

Checks variable's address descriptor, returns TRUE if it
is in memory, else returns FALSE.
}



4. Register Management

A register can hold just one single value. But in
the Tiny-C compiler, this value can belong to more than
one token at the same time; for instance, the same register
can keep two variables, one constant and two temporary
variables in it if they all have the same value at that
particular time.

We will define the structure of the register manager
like this:



#define MXREG 16 /* # of target machine's registers */
#define MXVAR 5 /* maximum # of variables that
one single register can hold */

int *regtr[MXREG], /* pointers to register variables */

reg_arr[MXREG*MXVAR]; /* register variable array */

So, every register has MXVAR register
array (reg_arr) locations. These are token descriptors and
show which tokens (variables, constants and temporaries)
that particular register has at any time. The size of the
register array is MXREG times MXVAR.

The register array keeps the names of the tokens which
are loaded in some registers.

Since particular parts of the register array belong to
particular registers, we can easily figure out which tokens
are in which registers, or which register has which tokens.

In order to calculate a new result, the compiler has to
find an unused register to load the value. The following
routine provides free registers to the code generator.



get_a_reg (reg) /* get a register */

char *reg; /* register number */

{

Checks every register's register array locations. If it finds
a blank one, returns this register to the code generator. If they
are all occupied, evacuates one of them randomly, and returns
it.
}

The compiler should be able to load a token from its
memory location into one of the registers. The following
routine is used for this purpose.



load_in_reg (token_no, reg, kind) /* load into a register */

int token_no; /* token number */

char *reg, /* register number */

kind; /* token kind */

{

Takes a register, a token number and its kind, and
generates code to load it into that given register.

}

After loading this token into a register, its address
descriptor has to be set as "in both register and memory",
and the register manager should set the members of this
particular register.

Suppose we load an integer value "3" into a
register. The register manager should know that the register
is keeping a constant value "3", or which constant number
from our constant table is in that register.

Then, suppose we assign this constant to a variable,
like in the statement: "joe=3."

The register manager should mark that this particular
register has a constant and a variable in it.

The following routine helps the register manager to state
that a register is now holding a given token.



occupy_reg (token_no, reg, kind) /* occupy register */

int token_no; /* token number */

char *reg, /* register number */

kind; /* token kind */

{

Enters given token into given register's register array
location, to mark that this register is holding that given
token in it.

}
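The register array and the two routines that claim registers can be sketched as below. Several points are assumptions of this sketch rather than facts from the thesis: a token is reduced to a single nonzero integer, EMPTY marks a free slot, occupy_reg takes the register by value and reports whether a slot was free, and eva_reg_stub only clears the slots where the real eva_reg would also generate store instructions.

```c
#include <stdlib.h>

#define MXREG 16   /* # of target machine's registers */
#define MXVAR 5    /* maximum # of tokens one register can hold */
#define EMPTY 0    /* hypothetical "no token" marker; tokens are nonzero here */

int reg_arr[MXREG * MXVAR];   /* register variable array */
int *regtr[MXREG];            /* regtr[r] points at register r's MXVAR slots */

void init_regs(void)
{
    int r, k;

    for (r = 0; r < MXREG; ++r) {
        regtr[r] = &reg_arr[r * MXVAR];
        for (k = 0; k < MXVAR; ++k)
            regtr[r][k] = EMPTY;
    }
}

int reg_is_blank(int r)
{
    int k;

    for (k = 0; k < MXVAR; ++k)
        if (regtr[r][k] != EMPTY)
            return 0;
    return 1;
}

/* Records that register r now also holds the given token.
   Returns 1 on success, 0 if all MXVAR slots are already taken. */
int occupy_reg(int token, int r)
{
    int k;

    for (k = 0; k < MXVAR; ++k) {
        if (regtr[r][k] == EMPTY) {
            regtr[r][k] = token;
            return 1;
        }
    }
    return 0;
}

/* Stand-in for eva_reg: forgets the members without emitting any code. */
void eva_reg_stub(int r)
{
    int k;

    for (k = 0; k < MXVAR; ++k)
        regtr[r][k] = EMPTY;
}

/* Returns a blank register, evacuating a random one if none is blank. */
void get_a_reg(char *reg)
{
    int r;

    for (r = 0; r < MXREG; ++r) {
        if (reg_is_blank(r)) {
            *reg = (char)r;
            return;
        }
    }
    r = rand() % MXREG;
    eva_reg_stub(r);
    *reg = (char)r;
}
```

Giving each register a fixed slice of reg_arr means that "which tokens are in register r" is a scan of MXVAR slots, and "which register holds this token" is a scan of the whole array.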



There may be times when the compiler assigns a new
value to a variable, but that particular variable may
have been stored in a different register before. Since we
want to bind a new register to the old variable, we want to
release its old register.



rel_sym_reg (sym_no) /* release symbol's register */

int sym_no;

{

Takes a variable, finds its register, and deletes its
membership to this register.

}



Sometimes the compiler has to store a token from its
register into its memory location. The following two routines
do this chore.



eva_symbol (sym_no, reg_no) /* evacuate register from symbol */
char sym_no, reg_no;
{

Generates code to transfer symbol from register into
memory. Then sets the symbol's address descriptor as "in
memory" only.

}



eva_temp (temp_no, reg_no) /* take temporary out of register */
char temp_no, reg_no;
{

Generates code to transfer temporary from register into
memory. Then sets its address descriptor as "in memory" only.
}



And there are some cases when the compiler wants to empty a
register completely. For instance, we may do this to release
a register.



eva_reg (reg_no) /* evacuate register */

char reg_no;

{

Takes a register number, finds all its members in the
register array, and generates code to transfer those members
to their memory locations if they are not already there.
(Uses above two routines, actually).

}



Before getting out of a basic block, the compiler should
empty all registers. The following routine does this task.



clean_regs () /* clean registers */

{

Calls "evacuate register" routine for all registers.

}



5. Operands for the Operators

In the actual code generation phase, the compiler
looks for an operator, and according to the operator's type,
requests the registers for operands. The following two
routines return integer operands in registers.



load_two_oprnd (j, r1, r2, step) /* load two integer operand */

int j; /* pointer to int. code */

char *r1, *r2, /* registers */

*step; /* # of the total steps taken */

{

Gets two operands from intermediate code, loads them
into two available registers, and returns these register
numbers to the code generator. Since our parse tree is in
a flattened form, the code generator needs to know where it
has arrived in that array-tree after loading these operands. So,
"step" is a variable that tells how many steps have
been consumed in the intermediate code.

}



load_one_oprnd (i, reg, step) /* load one integer operand */

int i; /* pointer to int. code */

char *reg, /* register number */

*step; /* # of the total steps taken */

{

Loads the next operand in the parse tree into a
register, and returns the register number with the number of
steps walked in the parse tree.
}
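The idea of counting "steps" in a flattened tree can be illustrated with a helper that computes how many table entries a whole subtree occupies. This is a sketch under the assumption that children are emitted immediately before their parent and that each entry records its own number of children; subtree_len is a hypothetical name, not a Tiny-C routine, and the real routines may count steps differently.

```c
/* chl[i] is the number of children of the node stored at index i.
   Returns the number of entries in the subtree whose root is at index i. */
int subtree_len(const int *chl, int i)
{
    int len = 1;   /* the root itself */
    int c;

    /* the last child ends at i - 1; each earlier child ends just
       before the subtree of the child that follows it */
    for (c = 0; c < chl[i]; ++c)
        len += subtree_len(chl, i - len);
    return len;
}
```

For the emissions of "joe = jimmy * 5" (jimmy, variable, 5, constant, multiplication, joe, variable, assignment), with leaves given a child count of 0 as an assumption of this sketch, the child counts are 0, 1, 0, 1, 2, 0, 1, 2. The subtree rooted at the multiplication then spans 5 entries and the whole assignment spans 8.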



The Abstract Machine AM has some boolean operators
that accept only boolean operands. But everything in Tiny-C
has integer type. So the code generator should have some
tools to convert integer values into booleans. The
following two routines provide boolean operands for boolean
operators whenever they are needed.



two_bool (j, r1, r2, step) /* load two boolean operand */

int j; /* pointer to int. code */

char *r1, *r2, /* registers */

*step; /* # of the total steps taken */

{

Loads two operands. If they have integer values it loads
the corresponding boolean values into registers and returns
them to the code generator.

}



one_bool (i, reg_no, step) /* load one boolean oprnd */

int i; /* pointer to int. code */

char *reg_no, /* register number */

*step; /* # of the total steps taken */

{

Returns one boolean operand into a register.

}



J. CODE GENERATION

In the Tiny-C compiler, the main routine in the code
generator is a large switch statement, as is used in most
compilers. The compiler generates code for the data segment
first, which is just a memory allocation routine for the
symbols. Then the code segment comes as the actual code
generation phase. The following routine is a subset of the
code generation routine for the code segment. Each case
element dispatches to the code emitter for that case.



code_seg () /* give code segment */

{

int i; /* index variable */

char r1, r2, /* register numbers */

step;

/* walk emit array from beginning to emit-end */

for (i=0; i< emit_end; ++i)

/* if node has children, (if it is not a leaf) */
if (emitchl[i] != 0)

switch ( emitstr[i] )

{

case IADD : /* integer addition */

code_iadd (i, &step);

break;

.
.
.

case MEND : /* end of main function */

fprintf (f1, " stop\n");

break;

}

}



The following routine is used by the above "code_seg()"
routine and emits code for integer additions.



code_iadd (i, step) /* integer addition */

int i;

char *step; /* # of the steps taken on int. code */

{

char r1, r2, /* register numbers */

temp_no; /* temporary variable number */

/* load two operands */

load_two_oprnd (i-1, &r1, &r2, step);

/* they both might be in the same register,
if so, allocate one more register */

if (r1==r2)

{

get_a_reg (&r1);

fprintf (f1, " mov r(0:%d), r(0:%d)\n", r2, r1);

}

/* since addition will be loaded in r1, evacuate it first */
eva_reg (r1);

/* code for integer addition */

fprintf (f1, " add r(0:%d), r(0:%d)\n", r2, r1);

/* give a number to this temporary result */

get_a_temp (&temp_no);
occupy_reg (temp_no, r1, TEMP);

/* set temporary's address descriptor */
temp_var[temp_no] = r1+1;

/* validate emission array for sequential code generation */
emitstr[i] = TEMP;
emitchl[i] = *step+1;
emitstr[i-1] = temp_no;

}



Some sample C programs and the code generated for them by
the Tiny-C compiler can be found in Appendix F.

V. CONCLUSION



Precise, understandable and enforceable interface
standards can provide a way to improve efforts toward
portable software. In the Tiny-C implementation we showed a
way to improve the programming capabilities of AM, and
encouraged programmers to use such a portable and standardisable
machine in high level languages.

Unfortunately, this implementation is not completely
satisfactory. Because of restricted capabilities in the
target AM machine, the Tiny-C compiler does not fully
support application programming. Some of these restrictions
are:

- Based on the principle of resource abstraction, AM has
strictly defined data types. Since it presently does not
support conversion between two types, it is a higher
level concept than the "C" language. So, contrary to
usual implementations, this thesis had an opposite
direction: production of a lower level tool in a higher
level environment.

- The AM abstract machine does not yet have a complete
linker. So, the user is forced to keep the whole
program and input/output library in one single module,
which is extremely inconvenient in application
environments.

- The current version of AM is an emulator, rather than
hardware. Even though this is convenient for a development
phase, it is not going to be an easy-to-use product
for users.

So, further development that could be done for an
improved AM environment might include:

- A linker for AM

- Type conversion between AM data types

- An input/output library for the Tiny-C compiler

- A Tiny-C code generator for AM machine code (instead of
a source generator for the AM Assembler)

- Given improvements in AM, an extended version of the
Tiny-C compiler to cover the whole Tiny-C language
grammar

- A compiler version of AM.



APPENDIX A

GRAMMAR FOR TINY-C LANGUAGE

PROGRAM:

program:
    <pre-processor>* <data-definition>*
    <function-definition>+

PRE-PROCESSOR:

pre-processor:
    "#define" <file-definition> |
    "#include" <file-definition>

file-definition:
    '"' <filename> '"' |
    '<' <filename> '>'

filename:
    <identifier> <filetype>

filetype:
    '.' <identifier>



DATA DEFINITIONS:

data-definition:
    <sc-specifier>? <declaration>

sc-specifier:
    "auto" |
    "static" |
    "extern" |
    "register"

declaration:
    <type-specifier> <variable-declaration-list>

type-specifier:
    "char" |
    "short" |
    "int" |
    "long" |
    "unsigned" |
    "float" |
    "double"

variable-declaration-list:
    <variable-declaration> <more-variable-declarations>*

more-variable-declarations:
    ',' <variable-declaration>



DECLARATIONS:

variable-declaration:
    "*"? <identifier> <index-declaration>? <initializer>?

index-declaration:
    "[" <constant-expression> (1) "]"

initializer:
    "=" <primary>

primary:
    <identifier> |
    <constant> |
    <char-definition> |
    <string>

char-definition:
    "'" <character> "'"

string:
    '"' <character>* '"'



FUNCTION DEFINITION:

function-definition:
    <type-specifier>? <function-declaration> <function-body>

function-declaration:
    <identifier> "(" <identifier-list>? ")"

identifier-list:
    <identifier> <more-identifiers>*

more-identifiers:
    ',' <identifier>

function-body:
    <type-decl-list>
    <compound-statement>

PARAMETER DECLARATIONS:

type-declaration-list:
    <parameter-declaration>+

parameter-declaration:
    <type-specifier> <parameter-declaration-list>

parameter-declaration-list:
    <parameter> <more-parameters>*

more-parameters:
    ',' <parameter>

parameter:
    '*'? <identifier> <index-declaration>?



STATEMENTS:

statement:
    <compound-statement> |
    <function-call> |
    <assignment-statement> ";" |
    <if-statement> |
    <while-statement> |
    <do-statement> |
    <for-statement> |
    <switch-statement> |
    <break-statement> |
    "continue" |
    <return-statement> |
    <goto-statement> |
    <label> |
    ";"

compound-statement:
    "{" <declaration>* <statement>+ "}"

function-call:
    <identifier> '(' <expression-list> ')'

expression-list:
    <expression> <more-expressions>*

more-expressions:
    ',' <expression>

assignment-statement:
    <assignment> |
    <incremental-expression>

assignment:
    <lvalue> "=" <logic-expression> |
    <lvalue> <shift-assignment-op> <shift-expression> |
    <lvalue> <bitwise-assignment-op> <bitwise-expression>

shift-assignment-op:
    "+=" | "-=" | "*=" | "/=" | "%=" | ">>=" | "<<="

bitwise-assignment-op:
    "&=" | "^=" | "|="

incremental-expression:
    "++" <lvalue> |
    "--" <lvalue> |
    <lvalue> "++" |
    <lvalue> "--"

if-statement:
    "if" "(" <logic-expression> ")" <statement>
    <else-statement>?

else-statement:
    "else" <statement>

while-statement:
    "while" "(" <logic-expression> ")" <statement>

do-statement:
    "do" <statement> "while" "(" <logic-expression> ")" ";"

for-statement:
    "for" "(" <assignment-list>? ";" <logic-expression>
    ";" <assignment-list>? ")" <statement>

assignment-list:
    <assignment-statement> <more-assignments>*

more-assignments:
    ',' <assignment-statement>

switch-statement:
    "switch" "(" <arithmetic-expression> ")" "{"
    <case-stmt>+ "}"

case-stmt:
    "case" | "default" <constant-expression>
    <statement>

break-statement:
    "break"

return-statement:
    "return" <expression>

goto-statement:
    "goto" <identifier>

label:
    <identifier> ":"



EXPRESSIONS:

expression:
    <string> |
    <pointer-expression> |
    <address-expression> |
    <logic-expression> |
    <incremental-expression>

pointer-expression:
    "*" <array-element> |
    "*" <identifier> |
    "*(" <arith-expr> ")"

address-expression:
    "&" <array-element> |
    "&" <identifier>

logic-expression:
    <logic-term> <more-logic-terms>*

more-logic-terms:
    "||" <logic-term>

logic-term:
    <logic-factor> <more-logic-factors>*

more-logic-factors:
    "&&" <logic-factor>

logic-factor:
    '!'? <bitwise-expression> |
    '!'? "(" <logic-expression> ")"

bitwise-expression:
    '~'? <bitwise-term> <more-bitwise-terms>*

more-bitwise-terms:
    "|" <bitwise-term>

bitwise-term:
    <bitwise-factor> <more-bitwise-factors>*

more-bitwise-factors:
    "^" <bitwise-factor>

bitwise-factor:
    <bitwise-element> <more-bitwise-elements>*

more-bitwise-elements:
    '&' <bitwise-element>



bitwise-element:
    <compare-expr> |
    "(" <bitwise-expression> ")"

compare-expression:
    <compare-term> <more-compare-terms>*

more-compare-terms:
    <equality-op> <compare-term>

equality-op:
    "==" | "!="

compare-term:
    <compare-factor> <more-compare-factors>*

more-compare-factors:
    <relation-op> <compare-factor>

relation-op:
    "<" | ">" | "<=" | ">="

compare-factor:
    <shift-expression> |
    "(" <compare-expression> ")"

shift-expression:
    <lvalue> <shift-op> <arith-expression> |
    <arith-expression>

shift-op:
    ">>" | "<<"

arith-expression:
    '-'? <term> <more-terms>*

more-terms:
    <add-op> <term>

add-op:
    "-" | "+"

term:
    <factor> <more-factors>*

more-factors:
    <mult-op> <factor>

mult-op:
    "*" | "/" | "%"

factor:
    "(" <arith-expr> ")" |
    <constant-expression> |
    <character-definition> |
    <function-call> |
    <incremental-expression> |
    <lvalue>

constant-expression:
    <constant> |
    <constant-identifier>

lvalue:
    <array-element> |
    <identifier> |
    <pointer-expression>

array-element:
    <identifier> <index>

index:
    "[" <arith-expression> "]"

SEMANTIC CONSTRAINTS:

(1) Prohibited for extern and parameter declarations.
    Mandatory for others.
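
Many rules in this grammar have the shape "item followed by zero or more repetitions", for example identifier-list and more-identifiers. The following is a minimal sketch, not the thesis code, of how such a pair of rules can be turned into a backtracking recognizer. The names src, pos, match_ch, ident, more_ids, ident_list, recognize and check_recognizer are hypothetical and chosen only for this illustration, and the identifier rule (a letter followed by letters, digits or '_') is an assumption.

```c
#include <assert.h>
#include <ctype.h>

#define TRUE  1
#define FALSE 0

/* Hypothetical scanner state: input text and index of the next character. */
static const char *src;
static int pos;

static void skip_blanks(void)
{
    while (src[pos] == ' ') pos++;
}

/* Match one terminal character, skipping leading blanks. */
static int match_ch(char c)
{
    skip_blanks();
    if (src[pos] != c) return FALSE;
    pos++;
    return TRUE;
}

/* <identifier>: assumed to be a letter followed by letters, digits or '_'. */
static int ident(void)
{
    skip_blanks();
    if (!isalpha((unsigned char)src[pos])) return FALSE;
    while (isalnum((unsigned char)src[pos]) || src[pos] == '_') pos++;
    return TRUE;
}

/* more-identifiers: ',' <identifier>
   On failure the saved position is restored, so the caller sees
   the input exactly as it was before the attempt. */
static int more_ids(void)
{
    int oldp = pos;

    if (!match_ch(',')) goto quit;
    if (!ident())       goto quit;
    return TRUE;
quit:
    pos = oldp;
    return FALSE;
}

/* identifier-list: <identifier> <more-identifiers>* */
static int ident_list(void)
{
    if (!ident()) return FALSE;
    while (more_ids())
        ;
    return TRUE;
}

/* TRUE when the whole string is exactly one identifier list. */
int recognize(const char *text)
{
    src = text;
    pos = 0;
    if (!ident_list()) return FALSE;
    skip_blanks();
    return src[pos] == '\0';
}

/* Small self-check of the sketch. */
void check_recognizer(void)
{
    assert(recognize("a, b2, c_3"));
    assert(!recognize("a, , b"));
    assert(!recognize("a b"));
}
```

The "*" repetition becomes a while loop with an empty body, and the "?" option becomes an if with an empty body; only the routine that consumed input is responsible for undoing it.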
APPENDIX B

TINY-C PARSER VERSION 1

extern char buf[], nextch, func_end;
extern int  bufp, glbptr, line_no;



program()                       /* Tiny-C Program */
{
    while (preprcs())
        ;
    while (data_def())
        ;
    if (!func_def())            goto quit;
    while (!match(EOF))
    {
        func_end=FALSE;
        if (!func_def())        goto quit;
    }
    return(TRUE);
quit: return(FALSE);
}



preprcs()                       /* pre-processor */
{
    int oldp=bufp, linep=line_no;
    glbptr=bufp;

    if (matchtoken("#define "))
    {
        if (!cnst_id())         goto quit;
        if (!constant())        goto quit;
    }
    else if (matchtoken("#include "))
    {
        if (!file_def())        goto quit;
    }
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



file_def()                      /* file definition */
{
    int oldp=bufp, linep=line_no;
    char limiter;

    if (match('"'))             limiter='"';
    else if (match('<'))        limiter='<';
    else                        goto quit;

    if (!filename())            goto quit;

    if (limiter=='"')
    {
        if (!match('"'))        goto quit;
    }
    else if (!match('>'))       goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



filename()                      /* file name */
{
    if (!id())                  return(FALSE);
    if (filetype())
        ;
    return(TRUE);
}

filetype()                      /* file type */
{
    int oldp=bufp, linep=line_no;

    if (!match('.'))            goto quit;
    if (!id())                  goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



data_def()                      /* data definition */
{
    int oldp=bufp, linep=line_no;

    glbptr=bufp;
    if (sc_spcfr())
        ;
    if (dclrtion())
        return(TRUE);

quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}

sc_spcfr()                      /* sc specifier */
{
    if (matchtoken("auto "))
        ;
    else if (matchtoken("static "))
        ;
    else if (matchtoken("extern "))
        ;
    else if (matchtoken("register "))
        ;
    else return(FALSE);
    return(TRUE);
}



dclrtion()                      /* declaration */
{
    int oldp=bufp, linep=line_no;

    if (!typ_spf())             goto quit;
    if (!var_dec_list())        goto quit;
    if (!match(';'))            goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}

typ_spf()                       /* type specifier */
{
    if (matchtoken("char "))
        ;
    else if (matchtoken("short "))
        ;
    else if (matchtoken("int "))
        ;
    else if (matchtoken("long "))
        ;
    else if (matchtoken("unsigned "))
        ;
    else if (matchtoken("float "))
        ;
    else if (matchtoken("double "))
        ;
    else return(FALSE);
    return(TRUE);
}



var_dec_list()                  /* variable declaration list */
{
    if (!vardclr())             return(FALSE);
    while (morevardcls())
        ;
    return(TRUE);
}

morevardcls()                   /* more variable declarations */
{
    int oldp=bufp, linep=line_no;

    if (!match(','))            goto quit;
    if (!vardclr())             goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



vardclr()                       /* variable declaration */
{
    int oldp=bufp, linep=line_no;

    if (match('*'))
        ;
    if (!id())                  goto quit;
    if (indxdclr())
        ;
    if (initializer())
        ;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



indxdclr()                      /* index declaration */
{
    int oldp=bufp, linep=line_no;

    if (!match('['))            goto quit;
    if (cnst_expr())
        ;
    if (!match(']'))            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

initializer()                   /* initializer */
{
    int oldp=bufp, linep=line_no;

    if (!match('='))            goto quit;
    if (match('{'))
    {
        if (!expression())      goto quit;
        while (moreexpr())
            ;
        if (!match('}'))        goto quit;
    }
    else if (!expression())     goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



func_def()                      /* function definition */
{
    int oldp=bufp, linep=line_no;

    glbptr=bufp;
    if (typ_spf())
        ;
    if (!func_dclr())           goto quit;
    glbptr=bufp;
    if (!func_body())           goto quit;
    func_end=TRUE;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

func_dclr()                     /* function declaration */
{
    int oldp=bufp, linep=line_no;

    if (!id())                  goto quit;
    if (!match('('))            goto quit;
    if (idnfrs())
        ;
    if (!match(')'))            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}



idnfrs()                        /* identifiers */
{
    if (!id())                  return(FALSE);
    while (more_id())
        ;
    return(TRUE);
}

more_id()                       /* more identifiers */
{
    int oldp=bufp, linep=line_no;

    if (!match(','))            goto quit;
    if (!id())                  goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

func_body()                     /* function body */
{
    int oldp=bufp, linep=line_no;

    if (!type_dec_lst())        goto quit;
    glbptr=bufp;
    if (!cmpn_stmt())           goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

type_dec_lst()                  /* type declaration list */
{
    int oldp=bufp, linep=line_no;

    if (par_dclrtion())
        ;
    while (par_dclrtion())
        ;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



par_dclrtion()                  /* parameter declarations */
{
    int oldp=bufp, linep=line_no;

    if (!typ_spf())             goto quit;
    if (!par_dec_list())        goto quit;
    if (!match(';'))            goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}

par_dec_list()                  /* parameter declaration list */
{
    if (!parameter())           return(FALSE);
    while (morepardcls())
        ;
    return(TRUE);
}

morepardcls()                   /* more parameter declarations */
{
    int oldp=bufp, linep=line_no;

    if (!match(','))            goto quit;
    if (!parameter())           goto quit;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}

parameter()                     /* parameter */
{
    int oldp=bufp, linep=line_no;

    if (match('*'))
        ;
    if (!id())                  goto quit;
    if (indxdclr())
        ;
    return(TRUE);
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



stmt()                          /* statement */
{
    if (cmpn_stmt())
        ;
    else if (if_stmt())
        ;
    else if (while_stmt())
        ;
    else if (do_stmt())
        ;
    else if (for_stmt())
        ;
    else if (swtc_stmt())
        ;
    else if (break_stmt())
        ;
    else if (matchtoken("continue ",1))
    {   if (!match(';'))        goto quit;
    }
    else if (rtrn_stmt())
        ;
    else if (goto_stmt())
        ;
    else if (func_call())
    {   if (!match(';'))        goto quit;
    }
    else if (asnmt())
    {   if (!match(';'))        goto quit;
    }
    else if (label())
        ;
    else if (match(';'))
        ;
    else                        goto quit;
    return(TRUE);
quit: return(FALSE);
}



cmpn_stmt()                     /* compound statement */
{
    int oldp=bufp, linep=line_no;

    if (!match('{'))            goto quit;
    while (dclrtion())
        ;
    if (!stmt())                goto quit;
    while (stmt())
        ;
    if (!match('}'))            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

func_call()                     /* function call */
{
    int oldp=bufp, linep=line_no;

    if (!id())                  goto quit;
    if (!match('('))            goto quit;
    if (expr_lst())
        ;
    if (!match(')'))            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

expr_lst()                      /* expression list */
{
    if (!expression())          return(FALSE);
    while (moreexpr())
        ;
    return(TRUE);
}



moreexpr()                      /* more expressions */
{
    int oldp=bufp, linep=line_no;

    if (!match(','))            goto quit;
    if (!expression())          goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

asnmt()                         /* assignment statement */
{
    if (assign())
        ;
    else if (incr_stmt())
        ;
    else return(FALSE);
    return(TRUE);
}

assign()                        /* simple assignment */
{
    int oldp=bufp, linep=line_no;

    if (!lvalue())              goto quit;
    if (match('='))
    {   if (!lgc_expr())        goto quit;
    }
    else if (shf_asm_op())
    {   if (!shf_expr())        goto quit;
    }
    else if (btw_asm_op())
    {   if (!btw_expr())        goto quit;
    }
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}



shf_asm_op()                    /* shift assignment operator */
{
    if (matchtoken("+= ",0))
        ;
    else if (matchtoken("-= ",0))
        ;
    else if (matchtoken("*= ",0))
        ;
    else if (matchtoken("/= ",0))
        ;
    else if (matchtoken("%= ",0))
        ;
    else if (matchtoken(">>= ",0))
        ;
    else if (matchtoken("<<= ",0))
        ;
    else return(FALSE);
    return(TRUE);
}

btw_asm_op()                    /* bitwise assignment operator */
{
    if (matchtoken("&= ",0))
        ;
    else if (matchtoken("^= ",0))
        ;
    else if (matchtoken("|= ",0))
        ;
    else return(FALSE);
    return(TRUE);
}

incr_stmt()                     /* incremental statement */
{
    int oldp=bufp, linep=line_no;
    char pre_op=TRUE;           /* pre-operator */

    if (matchtoken("++ ",0))
        ;
    else if (matchtoken("-- ",0))
        ;
    else pre_op=FALSE;

    if (!lvalue())              goto quit;

    if (!pre_op)
    {
        if (matchtoken("++ ",0))
            ;
        else if (matchtoken("-- ",0))
            ;
        else                    goto quit;
    }
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}



if_stmt()                       /* if statement */
{
    if (!matchtoken("if ",1))   goto quit;
    if (!match('('))            goto quit;
    if (!lgc_expr())            goto quit;
    if (!match(')'))            goto quit;
    if (!stmt())                goto quit;
    if (else_stmt())
        ;
    return(TRUE);
quit: return(FALSE);
}

else_stmt()                     /* else statement */
{
    if (!matchtoken("else ",1)) goto quit;
    if (!stmt())                goto quit;
    return(TRUE);
quit: return(FALSE);
}



while_stmt()                    /* while statement */
{
    if (!matchtoken("while ",1)) goto quit;
    if (!match('('))            goto quit;
    if (!lgc_expr())            goto quit;
    if (!match(')'))            goto quit;
    if (!stmt())                goto quit;
    return(TRUE);
quit: return(FALSE);
}

do_stmt()                       /* do statement */
{
    if (!matchtoken("do ",1))   goto quit;
    if (!stmt())                goto quit;
    if (!matchtoken("while ",1)) goto quit;
    if (!match('('))            goto quit;
    if (!lgc_expr())            goto quit;
    if (!match(')'))            goto quit;
    if (!match(';'))            goto quit;
    return(TRUE);
quit: return(FALSE);
}

for_stmt()                      /* for statement */
{
    if (!matchtoken("for ",1))  goto quit;
    if (!match('('))            goto quit;
    if (asn_lst())
        ;
    if (!match(';'))            goto quit;
    if (!lgc_expr())            goto quit;
    if (!match(';'))            goto quit;
    if (asn_lst())
        ;
    if (!match(')'))            goto quit;
    if (!stmt())                goto quit;
    return(TRUE);
quit: return(FALSE);
}










asn_lst()                       /* assignment list */
{
    if (!asnmt())               return(FALSE);
    while (more_asnmt())
        ;
    return(TRUE);
}

more_asnmt()                    /* more assignments */
{
    int oldp=bufp, linep=line_no;

    if (!match(','))            goto quit;
    if (!asnmt())               goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

swtc_stmt()                     /* switch statement */
{
    if (!matchtoken("switch ",1)) goto quit;
    if (!match('('))            goto quit;
    if (!art_expr())            goto quit;
    if (!match(')'))            goto quit;
    if (!match('{'))            goto quit;
    if (!case_stmt())           goto quit;
    while (case_stmt())
        ;
    if (!match('}'))            goto quit;
    return(TRUE);
quit: return(FALSE);
}

case_stmt()                     /* case statement */
{
    if (matchtoken("case ",1))
    {   if (!cnst_expr())       goto quit;
    }
    else if (matchtoken("default ",0))
        ;
    else                        goto quit;
    if (!match(';'))            goto quit;
    while (stmt())
        ;
    return(TRUE);
quit: return(FALSE);
}



break_stmt()                    /* break statement */
{
    if (!matchtoken("break ",1)) goto quit;
    if (!match(';'))            goto quit;
    return(TRUE);
quit: return(FALSE);
}

rtrn_stmt()                     /* return statement */
{
    if (!matchtoken("return ",1)) goto quit;
    if (expression())
        ;
    if (!match(';'))            goto quit;
    return(TRUE);
quit: return(FALSE);
}

goto_stmt()                     /* goto statement */
{
    if (!matchtoken("goto ",1)) goto quit;
    if (!id())                  goto quit;
    if (!match(';'))            goto quit;
    return(TRUE);
quit: return(FALSE);
}

label()                         /* label */
{
    int oldp=bufp, linep=line_no;

    if (!id())                  goto quit;
    if (!match(':'))            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

expression()                    /* expression */
{
    if (string())
        ;
    else if (pntr_expr())
        ;
    else if (addr_expr())
        ;
    else if (lgc_expr())
        ;
    else if (incr_stmt())
        ;
    else return(FALSE);
    return(TRUE);
}



pntr_expr()                     /* pointer expression */
{
    int oldp=bufp, linep=line_no;

    if (!match('*'))            goto quit;
    if (array_elm())
        ;
    else if (id())
        ;
    else if (match('('))
    {
        if (!art_expr())        goto quit;
        if (!match(')'))        goto quit;
    }
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

addr_expr()                     /* address expression */
{
    int oldp=bufp, linep=line_no;

    if (!match('&'))            goto quit;
    if (array_elm())
        ;
    else if (id())
        ;
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}



lgc_expr()                      /* logic expression */
{
    if (!lgc_trm())             return(FALSE);
    while (lg_trms())
        ;
    return(TRUE);
}

lg_trms()                       /* logic terms */
{
    int oldp=bufp, linep=line_no;

    if (!matchtoken("|| ",0))   goto quit;
    if (!lgc_trm())             goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

lgc_trm()                       /* logic term */
{
    if (!lgc_fct())             return(FALSE);
    while (lg_fcts())
        ;
    return(TRUE);
}

lg_fcts()                       /* logic factors */
{
    int oldp=bufp, linep=line_no;

    if (!matchtoken("&& ",0))   goto quit;
    if (!lgc_fct())             goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

lgc_fct()                       /* logic factor */
{
    int oldp=bufp, linep=line_no;

    if (match('!'))             /* unary operator */
        ;
    if (btw_expr())
        ;
    else if (match('('))
    {
        if (!lgc_expr())        goto quit;
        if (!match(')'))        goto quit;
    }
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}



btw_expr()                      /* bitwise expression */
{
    int oldp=bufp, linep=line_no;

    if (match('~'))
        ;
    if (!btw_trm())             goto quit;
    while (bt_trms())
        ;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

bt_trms()                       /* bitwise terms */
{
    int oldp=bufp, linep=line_no;

    if (!match('|'))            goto quit;
    if (!btw_trm())             goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

btw_trm()                       /* bitwise term */
{
    if (!btw_fct())             return(FALSE);
    while (bt_fcts())
        ;
    return(TRUE);
}

bt_fcts()                       /* bitwise factors */
{
    int oldp=bufp, linep=line_no;

    if (!match('^'))            goto quit;
    if (!btw_fct())             goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

btw_fct()                       /* bitwise factor */
{
    if (!btw_elm())             return(FALSE);
    while (bt_elms())
        ;
    return(TRUE);
}

bt_elms()                       /* bitwise elements */
{
    int oldp=bufp, linep=line_no;

    if (!match('&'))            goto quit;
    if (!btw_elm())             goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

btw_elm()                       /* bitwise element */
{
    int oldp=bufp, linep=line_no;

    if (cmp_expr())
        ;
    else if (match('('))
    {
        if (!btw_expr())        goto quit;
        if (!match(')'))        goto quit;
    }
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

cmp_expr()                      /* compare expression */
{
    int oldp=bufp, linep=line_no;

    if (!cmp_trm())             return(FALSE);
    while (cp_trms())
        ;
    return(TRUE);
}

cp_trms()                       /* compare terms */
{
    int oldp=bufp, linep=line_no;

    if (!equ_op())              goto quit;
    if (!cmp_trm())             goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

equ_op()                        /* equality operators */
{
    if (matchtoken("== ",0))
        ;
    else if (matchtoken("!= ",0))
        ;
    else return(FALSE);
    return(TRUE);
}

cmp_trm()                       /* compare term */
{
    if (!cmp_fct())             return(FALSE);
    while (cp_fcts())
        ;
    return(TRUE);
}

cp_fcts()                       /* compare factors */
{
    if (!rel_op())              goto quit;
    if (!cmp_fct())             goto quit;
    return(TRUE);
quit: return(FALSE);
}

rel_op()                        /* relational operator */
{
    if (match('<'))
    {   if (match('=')) ;
    }
    else if (match('>'))
    {   if (match('=')) ;
    }
    else return(FALSE);
    return(TRUE);
}



cmp_fct()                       /* compare factor */
{
    int oldp=bufp, linep=line_no;

    if (shf_expr())
        ;
    else if (match('('))
    {
        if (!cmp_expr())        goto quit;
        if (!match(')'))        goto quit;
    }
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

shf_expr()                      /* shift expression */
{
    int oldp=bufp, linep=line_no;

    if (shf_init())
        ;
    if (!art_expr())            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

shf_init()                      /* shift expression-initial */
{
    int oldp=bufp, linep=line_no;

    if (!lvalue())              goto quit;
    if (!shf_op())              goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

shf_op()                        /* shift operator */
{
    if (matchtoken(">> ",0))
        ;
    else if (matchtoken("<< ",0))
        ;
    else return(FALSE);
    return(TRUE);
}

art_expr()                      /* arithmetic expression */
{
    int oldp=bufp, linep=line_no;

    if (match('-'))             /* unary operator */
        ;
    if (!term())                goto quit;
    while (more_term())
        ;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}



more_term()                     /* more terms */
{
    if (!add_op())              goto quit;
    if (!term())                goto quit;
    return(TRUE);
quit: return(FALSE);
}

add_op()                        /* additional operator */
{
    if (match('+'))
        ;
    else if (match('-'))
        ;
    else return(FALSE);
    return(TRUE);
}



term()                          /* term */
{
    if (!factor())              return(FALSE);
    while (more_fcts())
        ;
    return(TRUE);
}

more_fcts()                     /* more factors */
{
    int oldp=bufp, linep=line_no;

    if (!mul_op())              goto quit;
    if (!factor())              goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

mul_op()                        /* multiplicational operator */
{
    if (match('*'))
        ;
    else if (match('/'))
        ;
    else if (match('%'))
        ;
    else return(FALSE);
    return(TRUE);
}



factor()                        /* factor */
{
    int oldp=bufp, linep=line_no;

    if (match('('))
    {
        if (!art_expr())        goto quit;
        if (!match(')'))        goto quit;
    }
    else if (func_call())
        ;
    else if (cnst_expr())
        ;
    else if (char_def())
        ;
    else if (incr_stmt())
        ;
    else if (lvalue())
        ;
    else                        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

cnst_expr()                     /* constant expression */
{
    if (constant())
        ;
    else if (!cnst_id())
        return(FALSE);
    return(TRUE);
}

lvalue()                        /* left value */
{
    int oldp=bufp, linep=line_no;
    char prnthsis=FALSE;

    if (match('('))             prnthsis=TRUE;
    if (array_elm())
        ;
    else if (id())
        ;
    else if (pntr_expr())
        ;
    else                        goto quit;
    if (prnthsis)
        if (!match(')'))        goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

array_elm()                     /* array element */
{
    int oldp=bufp, linep=line_no;

    if (!id())                  goto quit;
    if (!index())               goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

index()                         /* index expression for arrays */
{
    int oldp=bufp, linep=line_no;

    if (!match('['))            goto quit;
    if (art_expr())
        ;
    if (!match(']'))            goto quit;
    return(TRUE);
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep;
    return(FALSE);
}

primary()                       /* primary expression */
{
    if (cnst_expr())
        ;
    else if (array_elm())
        ;
    else if (id())
        ;
    else if (char_def())
        ;
    else if (string())
        ;
    else return(FALSE);
    return(TRUE);
}



as 



APPENDIX C 



TERMINALS AND BASIC NONTERMINALS



isaltr(c)                       /* is a letter? */
int c;                          /* character to test */
{
    return( (c>='A' && c<='Z') || (c>='a' && c<='z') );
}



iscapch(c)                      /* is a capital letter? */
int c;                          /* character to test */
{
    return( c>='A' && c<='Z' );
}



isadgt(c)                       /* is a digit? */
int c;                          /* character to test */
{
    return( c>='0' && c<='9' );
}



isidch(c)                       /* is identifier character? */
int c;
{
    return( isaltr(c) || isadgt(c) || c=='_' );
}
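The predicates above can be combined to check a whole identifier, as `id()` does one character at a time further on. The sketch below restates them in ANSI C and adds a hypothetical helper, `is_identifier`, that is not part of the thesis. It assumes the character tested in `isidch` alongside letters and digits is the underscore, which the scanned listing does not show legibly.

```c
#include <assert.h>
#include <stddef.h>

/* Letter test, as in the listing. */
int isaltr(int c) { return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'); }

/* Digit test, as in the listing. */
int isadgt(int c) { return c >= '0' && c <= '9'; }

/* Identifier-character test; the '_' case is an assumption. */
int isidch(int c) { return isaltr(c) || isadgt(c) || c == '_'; }

/* Hypothetical helper: does s consist of a letter followed only by
   identifier characters?  An empty string is rejected. */
int is_identifier(const char *s)
{
    size_t i;

    assert(s != NULL);
    if (!isaltr((unsigned char)s[0])) return 0;
    for (i = 1; s[i] != '\0'; ++i)
        if (!isidch((unsigned char)s[i])) return 0;
    return 1;
}
```

With these definitions `is_identifier("joe_1")` is accepted, while `"1joe"` and `"a-b"` are rejected because they start with a digit or contain a character outside the identifier set.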



delimiter()                     /* is next-character a delimiter? */
{
    return( nextch=='<' || nextch=='>' ||
            nextch=='+' || nextch=='-' ||
            nextch=='/' || nextch==';' ||
            nextch=='(' || nextch==')' ||
            nextch=='[' || nextch==']' ||
            /* ... further single-character tests illegible in the scanned listing ... */
            whtchr(nextch) );
}


whtchr(c)                       /* is a white-character? */
int c;                          /* character to test */
{
    return( c==' ' || c==TAB || c==CR
            /* ... remaining tests illegible in the scanned listing ... */ );
}



char_def()                      /* is character definition? */
{
    char blank=FALSE;           /* boolean var. for blanks */

    delwht();                   /* skip white characters */

    /* character definition should start with ' character */
    if (!match('\'')) return(FALSE);

    /* consume the following white characters, if there is any */
    while (whtchr(nextch))
    {
        nextch = getchr();
        blank = TRUE;
    }

    if (nextch=='\'')
    {
        /* check if character body is empty */
        if (!blank)
            /* illegal character definition */ err_msg(ICDF);
        /* else it is a blank character */
        else
        {
            nextch=getchr();
            return(TRUE);
        }
    }

    /* if met \ character, parse one more */
    if (nextch=='\\') nextch=getchr();

    /* parse the original character */
    nextch=getchr();

    /* should finish with ' character! */
    if (!match('\''))
        /* illegal character definition */ err_msg(ICDF);

    return(TRUE);
}



string()                        /* is string? */
{
    char i;                     /* index variable */

    delwht();                   /* skip white characters */

    /* string must start with " character */
    if (!match('"')) return(FALSE);

    /* since strings not implemented in Tiny-C, just consume it */
    for (i=0; (nextch!='"') && (i<=MXSTR); ++i)
        nextch=getchr();

    /* check if it is too long */
    if (i > MXSTR)
        /* string length too long */ err_msg(SLTL);

    /* should finish with " character */
    match('"');

    return(TRUE);
}



constant()                      /* integer constant */
{
    char i=0;                   /* index variable */

    delwht();                   /* skip white characters */

    /* it should start with a digit */
    if (!isadgt(nextch)) return(FALSE);

    while (isadgt(nextch))      /* parse the number */
    {
        /* check if number length is too long */
        if (i>=MXNML)
            /* number length too long */ err_msg(NLTL);
        else num_name[i++]=nextch;

        nextch=getchr();
    }

    if (nextch!='\0' && !delimiter())
        /* delimiter was expected */ err_msg(DWEX);

    num_name[i]='\0';

    /* convert string "num_name" into numeric value */
    str_num();

    /* add number into constant table */
    add_num();

    return(TRUE);
}
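`constant()` collects the digits into `num_name` and leaves the conversion to `str_num()`, whose body is not shown in this appendix. The sketch below illustrates one way such a conversion could detect the "too big numeric value" condition (`TBNV`) before it overflows. The function name `digits_to_int` and its interface are assumptions for illustration, not the thesis implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for str_num(): convert a string of decimal
   digits to an int.  Returns 1 and stores the value in *out on
   success; returns 0, leaving *out untouched, if the value would
   not fit in an int. */
int digits_to_int(const char *digits, int *out)
{
    int value = 0;
    size_t i;

    assert(digits != NULL && out != NULL);
    for (i = 0; digits[i] != '\0'; ++i) {
        int d = digits[i] - '0';

        assert(d >= 0 && d <= 9);           /* caller passes digits only */
        /* value*10 + d <= INT_MAX  <=>  value <= (INT_MAX - d) / 10 */
        if (value > (INT_MAX - d) / 10) return 0;
        value = value * 10 + d;
    }
    *out = value;
    return 1;
}
```

The range check is done before the multiplication, so the arithmetic itself never overflows. A caller in the style of the listing would report `TBNV` when the function returns 0.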



cnst_id()                       /* constant identifier */
{
    int oldp, linep;
    char i=0;                   /* index variable */

    delwht();                   /* skip white characters */

    oldp=bufp; linep=line_no;
    nextch=buf[bufp];

    /* first character should be a capital letter */
    if (!iscapch(nextch)) return(FALSE);

    while (iscapch(nextch))
    {
        /* check if identifier length is too long */
        if (i>=MXIDL)
            /* identifier length too long */ warning(ILTL);
        else id_name[i++]=nextch;

        nextch=getchr();
    }

    /* if following character is still a letter, it can be a lower
       letter only, since Tiny-C assumes constant identifiers are
       all capital letters, this cannot be a constant identifier */
    if (isaltr(nextch))
        goto quit;

    if (nextch!='\0' && !delimiter())
        /* delimiter was expected */ err_msg(DWEX);

    id_name[i]='\0';

    return(TRUE);

    /* backtrack on the scanner buffer, and return FALSE */
quit: bufp=oldp; line_no=linep; nextch=buf[bufp];
    return(FALSE);
}



id()                            /* is identifier? */
{
    char i=0;                   /* index variable */

    delwht();                   /* skip white characters */

    /* should start with a letter */
    if (!isaltr(nextch)) return(FALSE);

    while (isidch(nextch))
    {
        /* check if identifier length is too long */
        if (i>=MXIDL)
            /* identifier length too long */ warning(ILTL);
        else id_name[i++]=nextch;

        nextch=getchr();
    }

    /* following character must be a delimiter! */
    if (nextch!='\0' && !delimiter())
        /* delimiter was expected */ err_msg(DWEX);

    id_name[i]='\0';

    /* if identifier is a reserved word, give error message */
    if (!rsvr_test())
        /* reserved word not expected */ err_msg(RVNE);

    return(TRUE);
}



APPENDIX D



TINY-C COMPILER ERROR AND WARNING MESSAGES



Error Messages:



#define AVNI    1   /* auto variables not implemented        */
#define SVNI    2   /* static variables not implemented      */
#define EVNI    3   /* external variables not implemented    */
#define RVNI    4   /* register variables not implemented    */
#define SMEX    5   /* semicolon was expected                */
#define LINI    6   /* long integers not implemented         */
#define UINI    7   /* unsigned integers not implemented     */
#define FPNI    8   /* floating points not implemented       */
#define DPNI    9   /* double precisions not implemented     */
#define IDEX   10   /* identifier was expected               */
#define IBSB   11   /* index body was supposed to be blank   */
#define RSBE   12   /* right square bracket was expected     */
#define CINI   14   /* compound initializers not implemented */
#define EXEX   15   /* expression was expected               */
#define LCBE   16   /* left curly bracket was expected       */
#define LPEX   17   /* left parenthesis was expected         */
#define RPEX   18   /* right parenthesis was expected        */
#define IEAC   19   /* identifier was expected after comma   */
#define EEAC   20   /* expression was expected after comma   */
#define PTNI   21   /* pointers not implemented              */
#define ARNI   22   /* arrays not implemented                */
#define IPNI   55   /* include preprocessor not implemented  */
#define FTEX   23   /* filetype was expected                 */
#define IVFD   24   /* invalid file definition               */
#define RCBE   25   /* right curly bracket was expected      */
#define PREX   26   /* parameter was expected                */
#define PREC   27   /* parameter expected after comma        */
#define AEAC   28   /* an assignment expected after comma    */
#define LPEI   29   /* left parenthesis expected after if    */
#define LPEW   30   /* left prnthsis. expected after while   */
#define LPEF   31   /* left parenthesis expected after for   */
#define LPES   41   /* left prnthesis. expected after switch */
#define ILEI   32   /* illegal logic expression in if        */
#define ILEW   33   /* illegal logic expression in while     */
#define ILEF   34   /* illegal logic expression in for       */
#define WMFD   35   /* while is missing from do              */
#define SSFI   36   /* a statement should follow after if    */
#define SSFE   37   /* a statement should follow after else  */
#define SSFW   38   /* a statement should follow after while */
#define SSFF   39   /* a statement should follow after for   */
#define SSFD   54   /* a statement should follow after do    */
#define SMIF   40   /* semicolon is missing in for           */
#define IAES   42   /* illegal arith. expression in switch   */
#define CSMS   43   /* case statement is missing             */
#define CLIM   44   /* colon is missing                      */
#define ICEC   45   /* invalid constant exprs. after case    */
#define AENI   46   /* address expression not implemented    */
#define OCNI   47   /* one's complement not implemented      */
#define BONI   48   /* bitwise operators not implemented     */
#define SENI   49   /* shift expressions not implemented     */
#define INPE   50   /* invalid pointer expression            */
#define INAE   51   /* invalid address expression            */
#define UNVR   52   /* unknown variable                      */
#define IAEI   52   /* invalid arith. expr. in array index   */
#define RVNE   56   /* reserved word not expected            */
#define ILFB   57   /* illegal function body                 */
#define CBPR   58   /* input couldn't be parsed              */
#define TBBP   59   /* too big block to parse                */
#define UEOF   60   /* unexpected end of file                */
#define CETL   61   /* comment endless or too long           */
#define SETL   62   /* string is endless or too long         */
#define UMPH   63   /* unmatched parenthesis                 */
#define SMUP   64   /* semicolon missing/unmatched prnthesis */
#define CIEX   65   /* constant identifier expected          */
#define CVEX   66   /* constant value expected               */
#define STIF   67   /* symbol table is full                  */
#define NLTL   68   /* numeric length too long               */
#define TBNV   69   /* too big numeric value                 */
#define NSIF   70   /* name string is full                   */
#define DTIF   71   /* definition table is full              */
#define CTIF   72   /* constant table is full                */
#define VSIF   73   /* variable string is full               */
#define LTIF   74   /* label table is full                   */
#define DCID   75   /* duplicated cons. id declaration       */
#define DLDC   76   /* duplicated label declaration          */
#define ICDF   77   /* illegal character definition          */
#define SLTL   78   /* string length too long                */
#define DWEX   79   /* delimiter was expected                */
#define UDLB   80   /* undeclared label                      */
#define DPDC   81   /* duplicated parameter declaration      */
#define DPFA   82   /* declared parameter is not a fun. arg. */
#define UNPE   83   /* undeclared parameter exists           */
#define DFDC   84   /* duplicated function declaration       */
#define ICAN   86   /* inconsistent argument number          */
#define DDDC   87   /* duplicated default declaration        */
#define IVBR   88   /* invalid break usage                   */
#define TMNL   89   /* too many nested level                 */
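The parser reports problems by passing one of these codes to `err_msg()`, whose body is not part of this appendix. As an illustration only, a code can be turned into its message text with a small lookup table. The names `msg_entry`, `error_table` and `error_text` below are hypothetical, and only three of the codes are included to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Three of the error codes from the list, with the values given there. */
#define SMEX  5     /* semicolon was expected  */
#define IDEX 10     /* identifier was expected */
#define DWEX 79     /* delimiter was expected  */

/* Hypothetical table entry pairing a code with its message text. */
struct msg_entry {
    int code;
    const char *text;
};

static const struct msg_entry error_table[] = {
    { SMEX, "semicolon was expected"  },
    { IDEX, "identifier was expected" },
    { DWEX, "delimiter was expected"  },
};

/* Hypothetical lookup: return the message for code, or a fixed
   fallback string when the code is not in the table. */
const char *error_text(int code)
{
    size_t n = sizeof error_table / sizeof error_table[0];
    size_t i;

    assert(code > 0);               /* the listed codes are all positive */
    for (i = 0; i < n; ++i)
        if (error_table[i].code == code)
            return error_table[i].text;
    return "unknown error";
}
```

A linear search is used because the codes are not contiguous (for example 13 and 85 do not appear in the list), so indexing an array directly by code would leave holes.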



Warning Messages:



#define AFRI    1   /* all functions return integer          */
#define CSIB    2   /* compound statement is blank           */
#define ILTL    3   /* identifier length too long            */
#define TOMF    4   /* main function is missing              */



APPENDIX E



INTERMEDIATE CODE DEFINITIONS FOR TINY-C




#define IADD    1   /* integer addition        */
#define ISUB    2   /* integer subtraction     */
#define IMUL    3   /* integer multiply        */
#define IDIV    4   /* integer division        */
#define MDLS    5   /* integer modulus         */
#define IMLB    6   /* label declaration       */
#define JUMP    7   /* unconditional jump      */
#define JPTR    8   /* jump if true            */
#define JPFL    9   /* jump if false           */
#define LGEX   10   /* logic expression        */
#define CONS   11   /* constant                */
#define ARID   12   /* array identifier        */
#define VARB   13   /* variable                */
#define UNMS   14   /* unary minus             */
#define LGNT   15   /* unary logic not         */
#define EQLT   18   /* equality                */
#define NTEQ   19   /* not equal               */
#define LSTN   20   /* less than               */
#define GRTN   21   /* greater than            */
#define LTEQ   22   /* less than or equal      */
#define GTEQ   23   /* greater than or equal   */
#define LGAN   24   /* logic and               */
#define LGOR   25   /* logic or                */
#define ASSN   26   /* assignment              */
#define ADAS   27   /* addition-assignment     */
#define SBAS   28   /* subtraction-assign      */
#define MLAS   29   /* multiply-assignment     */
#define DVAS   30   /* division-assignment     */
#define MDAS   31   /* modulus-assignment      */
#define FNCL   32   /* function call           */
#define ARGM   33   /* argument                */
#define EXLB   34   /* explicit label          */
#define GOTO   35   /* jump to exp. label      */
#define CASE   36   /* case statement          */
#define TVAR   37   /* temporary variable #    */
#define SWTC   38   /* switch statement        */
#define PNTR   39   /* pointer                 */
#define ADDR   40   /* address                 */
#define RTRN   41   /* return                  */
#define INDX   42   /* index                   */
#define FNDC   45   /* function declaration    */
#define STMT   46   /* statement               */
#define DUMY   47   /* dummy statement         */
#define BREK   48   /* break statement         */
#define DFLT   49   /* default case            */
#define INCR   50   /* increment               */
#define DCRT   51   /* decrement               */
#define INCL   52   /* increment, later        */
#define DCRL   53   /* decrement, later        */
#define NOOP   54   /* no operation            */
#define TEMP   55   /* temporary variable      */
#define CNVB   56   /* convert to boolean      */
#define BTEM   57   /* boolean temporary       */
#define FEND   58   /* function end            */
#define MAIN   59   /* main function           */
#define MEND   60   /* end of main function    */



APPENDIX F



TEST PROGRAMS FOR THE TINY-C COMPILER



Program 1.



main()
{
    int joe, jimmy;

    joe=5;
    jimmy=15;

    switch ( joe * 5 )
    {
        case 12:
            ++joe;
            break;
        default:
            joe=jimmy+27;
            break;
        case 14:
            joe=jimmy--;
            break;
    }
}



The Code for the Program 1



codeseg     equ     (0:0)
dataseg     equ     (1:0)
            org     dataseg
sym1        ds      1
sym2:       ds      1
            org     codeseg
            jmp     main
main:
            move    <int,5>,r(0:0)
            move    <int,15>,r(0:1)
            move    r(0:0),r(0:2)
            mul     r(0:0),r(0:2)
            move    r(0:0),sym1
            move    r(0:1),sym2
            move    r(0:2),sym3
            jmp     imlbl0
imlbl2:
            move    sym1,r(0:0)
            add     <int,1>,r(0:0),r(0:1)
            move    r(0:1),sym1
            jmp     imlbl1
imlbl3:
            move    <int,27>,r(0:0)
            move    sym2,r(0:1)
            add     r(0:1),r(0:0)
            move    r(0:0),sym1
            jmp     imlbl1
imlbl4:
            move    sym2,r(0:0)
            sub     <int,1>,r(0:0),r(0:1)
            move    r(0:0),sym1
            move    r(0:1),sym2
            jmp     imlbl1
imlbl0:
            move    sym3,r(0:0)
            move    <int,12>,r(0:1)
            if      r(0:0)==r(0:1),implbl2
            move    <int,14>,r(0:2)
            if      r(0:0)==r(0:2),implbl4
            jmp     implbl3
imlbl1:
            stop



Program 2.



main()
{
    int joe, jimmy;

    joe=3;
    jimmy=joe*37;

    if ( joe > 37 && jimmy<=joe && jimmy )
    {
        jimmy=18;
    }
    else
    {
        jimmy=27;
        joe=jimmy+25;
    }
}



The Code for the Program 2



codeseg     equ     (0:0)
dataseg     equ     (1:0)
            org     dataseg
sym1        ds      1
sym2:       ds      1
            org     codeseg
            jmp     main
main:       move    <int,3>,r(0:0)
            move    <int,37>,r(0:1)
            mul     r(0:0),r(0:1)
            move    <int,37>,r(0:3)
            if      r(0:0)>r(0:2),blbl0
            move    <bool,false>,r(0:3)
            jmp     blbl1
blbl0:      move    <bool,true>,r(0:3)
blbl1:      if      r(0:1)<=r(0:0),blbl2
            move    <bool,false>,r(0:4)
            jmp     blbl3
blbl2:      move    <bool,true>,r(0:4)
blbl3:      and     r(0:3),r(0:4)
            if      r(0:1)==<int,0>,blbl4
            move    <bool,true>,r(0:5)
            jmp     blbl5
blbl4:      move    <bool,false>,r(0:5)
blbl5:      and     r(0:4),r(0:5)
            move    r(0:0),sym1
            move    r(0:1),sym2
            if      r(0:5)==<bool,false>,imlbl0
            move    <int,18>,r(0:0)
            move    r(0:0),sym2
            jmp     imlbl1
imlbl0:     move    <int,27>,r(0:0)
            move    r(0:0),sym2
            move    <int,25>,r(0:0)
            move    sym2,r(0:1)
            add     r(0:1),r(0:0)
imlbl1:     stop



Program 3.



function(joe, jimmy)
int joe, jimmy;
{
    do
        joe = jimmy++;
    while ( joe == 3 );

    --jimmy;
}



main()
{
    int joe, jimmy;

    jimmy=5;
    do
        joe = jimmy--;
    while ( joe == 3 );

    function(jimmy, joe);

    ++jimmy;
}



The Code for the Program 3



codeseg     equ     (0:0)
dataseg     equ     (1:0)
            org     dataseg
sym1        ds      1
sym2:       ds      1
sym4        ds      1
sym5        ds      1
            org     codeseg
            jmp     main
function:   pop     s(0),r(0:0)
            pop     s(0),r(0:1)
            move    r(0:0),sym1
            move    r(0:1),sym2
imlbl0:     move    sym2,r(0:0)
            add     <int,1>,r(0:0),r(0:1)
            move    <int,3>,r(0:2)
            if      r(0:0)==r(0:2),blbl0
            move    <bool,false>,r(0:3)
            jump    blbl1
blbl0:      move    <bool,true>,r(0:3)
blbl1:      move    r(0:0),sym1
            move    r(0:1),sym2
            if      r(0:3)==<bool,true>,imlbl0
imlbl1:     move    sym2,r(0:0)
            sub     <int,1>,r(0:0),r(0:1)
            push    <int,1>,s(0)
            rts     s(1)

main:       move    <int,5>,r(0:0)
            move    r(0:0),sym5
            move    r(0:1),sym2
imlbl2:     move    sym5,r(0:0)
            sub     <int,1>,r(0:0),r(0:1)
            move    <int,3>,r(0:2)
            if      r(0:0)==r(0:2),blbl2
            move    <bool,false>,r(0:3)
            jump    blbl3
blbl2:      move    <bool,true>,r(0:3)
blbl3:      move    r(0:0),sym4
            move    r(0:1),sym5
            if      r(0:3)==<bool,true>,imlbl2
imlbl3:     move    sym4,r(0:0)
            push    r(0:0),s(0)
            move    sym5,r(0:1)
            push    r(0:1),s(0)
            jsr     funcall,s(1)
            pop     s(0),r(0:2)
            add     <int,1>,r(0:1),r(0:3)
            stop